Open a website in a new tab with Selenium + Python

So, I'm trying to open websites in new tabs inside my WebDriver. I want to do this because opening a new WebDriver for each website takes about 3.5 seconds with PhantomJS, and I want more speed.

I am using a multiprocessing Python script and I want to get some elements from each page, so the workflow looks like this:

    Open browser
    Loop through my array
    For each element in the array -> open the website in a new tab -> do my business -> close it

But I can’t find a way to achieve this.

Here is the code I'm using. It takes forever between websites, and I need it to be fast. Other tools are allowed, but I don't know many tools for scraping website content that is loaded with JavaScript (divs created when some event fires on load, etc.). That is why I need Selenium; BeautifulSoup cannot be used for some of my pages.

    #!/usr/bin/env python
    import multiprocessing, time, pika, json, traceback, logging, sys, os, itertools, urllib, urllib2, cStringIO, mysql.connector, shutil, hashlib, socket, urllib2, re
    from selenium import webdriver
    from selenium.webdriver.common.keys import Keys
    from PIL import Image
    from os import listdir
    from os.path import isfile, join
    from bs4 import BeautifulSoup
    from pprint import pprint

    def getPhantomData(parameters):
        try:
            # We create WebDriver
            browser = webdriver.Firefox()
            # Navigate to URL
            browser.get(parameters['target_url'])
            # Find all links by Selector
            links = browser.find_elements_by_css_selector(parameters['selector'])

            result = []
            for link in links:
                # Extract link attribute and append to our list
                result.append(link.get_attribute(parameters['attribute']))
            browser.close()
            browser.quit()
            return json.dumps({'data': result})
        except Exception, err:
            browser.close()
            browser.quit()
            print err

    def callback(ch, method, properties, body):
        parameters = json.loads(body)
        message = getPhantomData(parameters)

        if message['data']:
            ch.basic_ack(delivery_tag=method.delivery_tag)
        else:
            ch.basic_reject(delivery_tag=method.delivery_tag, requeue=True)

    def consume():
        credentials = pika.PlainCredentials('invitado', 'invitado')
        rabbit = pika.ConnectionParameters('localhost',5672,'/',credentials)
        connection = pika.BlockingConnection(rabbit)
        channel = connection.channel()

        # Connect to the channel
        channel.queue_declare(queue='com.stuff.images', durable=True)
        channel.basic_consume(callback,queue='com.stuff.images')

        print ' [*] Waiting for messages. To exit press CTRL^C'
        try:
            channel.start_consuming()
        except KeyboardInterrupt:
            pass

    workers = 5
    pool = multiprocessing.Pool(processes=workers)
    for i in xrange(0, workers):
        pool.apply_async(consume)

    try:
        while True:
            continue
    except KeyboardInterrupt:
        print ' [*] Exiting...'
        pool.terminate()
        pool.join()
+18
8 answers

You can open and close a tab with the keyboard shortcuts COMMAND + T and COMMAND + W (OS X). On other operating systems, use CONTROL + T and CONTROL + W.

In Selenium you can emulate this behavior. You need to create a single WebDriver and as many tabs as you have tests.

Here is the code.

    from selenium import webdriver
    from selenium.webdriver.common.keys import Keys

    driver = webdriver.Firefox()
    driver.get("http://www.google.com/")

    # open tab
    driver.find_element_by_tag_name('body').send_keys(Keys.COMMAND + 't')
    # You can use (Keys.CONTROL + 't') on other OSs

    # Load a page
    driver.get('http://stackoverflow.com/')
    # Make the tests...

    # close the tab
    # (Keys.CONTROL + 'w') on other OSs.
    driver.find_element_by_tag_name('body').send_keys(Keys.COMMAND + 'w')

    driver.close()
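
The "one driver, many tabs" idea can also be written as a loop over a list of URLs. The following is only a rough sketch, not the method from the snippet above: it opens tabs with `window.open` from JavaScript rather than with keyboard shortcuts, the function name `scrape_in_tabs` and the `extract` callback are hypothetical, and it assumes the new window handle is already listed when `execute_script` returns (add a wait if it is not). `window.open` does not wait for the page to load, so `extract` may need its own wait.

```python
def scrape_in_tabs(driver, urls, extract):
    """Visit each URL in its own tab of one shared driver and collect results.

    `driver` is an already started Selenium WebDriver. `extract` is a
    hypothetical callback that receives the driver while the new tab is
    focused and returns whatever is needed from the page.
    """
    main_handle = driver.current_window_handle
    results = []
    for url in urls:
        known = set(driver.window_handles)
        # Open the tab from JavaScript instead of sending keyboard shortcuts.
        driver.execute_script("window.open(arguments[0], '_blank');", url)
        # Find the new tab by comparing with the handles seen before.
        new_handles = [h for h in driver.window_handles if h not in known]
        if not new_handles:
            raise RuntimeError("no new tab appeared for %s" % url)
        driver.switch_to.window(new_handles[0])
        try:
            results.append(extract(driver))
        finally:
            driver.close()                        # closes only the focused tab
            driver.switch_to.window(main_handle)  # return to the original tab
    return results
```

With a real driver this might be called as `scrape_in_tabs(driver, urls, lambda d: d.title)`; the driver itself is started once and quit once, outside the loop.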
+29

This is a common skeleton adapted from other examples:

    from selenium import webdriver
    from selenium.webdriver.common.keys import Keys

    driver = webdriver.Firefox()
    driver.get("http://www.google.com/")

    # open tab
    # ... take the code from the options below

    # Load a page
    driver.get('http://bings.com')
    # Make the tests...

    # close the tab
    driver.quit()

Possible ways:

  • Sending <CTRL> + <T> to one element

        # open tab
        driver.find_element_by_tag_name('body').send_keys(Keys.CONTROL + 't')

  • Sending <CTRL> + <T> via action chains

        # needs: from selenium.webdriver.common.action_chains import ActionChains
        ActionChains(driver).key_down(Keys.CONTROL).send_keys('t').key_up(Keys.CONTROL).perform()

  • JavaScript fragment execution

        driver.execute_script('''window.open("http://bings.com","_blank");''')

    For this to work, the preferences browser.link.open_newwindow and browser.link.open_newwindow.restriction must be set appropriately. The default values in recent versions are fine; otherwise you might need:

        fp = webdriver.FirefoxProfile()
        fp.set_preference("browser.link.open_newwindow", 3)
        fp.set_preference("browser.link.open_newwindow.restriction", 2)

        # the Firefox driver takes the profile as firefox_profile
        driver = webdriver.Firefox(firefox_profile=fp)

    The problem is that these preferences are given other values and frozen, at least in Selenium 3.4.0. When you use a profile to set them, the Java binding throws an exception and the Python binding ignores the new values.

    In Java, there is a way to set these preferences without specifying a profile object when talking to geckodriver, but it is not yet implemented in the Python binding:

        FirefoxOptions options = new FirefoxOptions().setProfile(fp);
        options.addPreference("browser.link.open_newwindow", 3);
        options.addPreference("browser.link.open_newwindow.restriction", 2);
        FirefoxDriver driver = new FirefoxDriver(options);

The third option stopped working for Python in Selenium 3.4.0.

The first two options also seem to have stopped working in Selenium 3.4.0. They depend on sending a CTRL key event to an element. At first glance this looks like a problem with the CTRL key, but it actually fails because of Firefox's new multiprocess feature. Perhaps the new architecture imposes new ways of doing this, or perhaps it is a temporary implementation problem. In any case, we can disable it with:

    fp = webdriver.FirefoxProfile()
    fp.set_preference("browser.tabs.remote.autostart", False)
    fp.set_preference("browser.tabs.remote.autostart.1", False)
    fp.set_preference("browser.tabs.remote.autostart.2", False)

    # the Firefox driver takes the profile as firefox_profile
    driver = webdriver.Firefox(firefox_profile=fp)

... and then you can successfully use the first method.

+16
    browser.execute_script('''window.open("http://bings.com","_blank");''')

where browser is the WebDriver instance.
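
When the URL comes from a variable, it can be passed to the script as an argument instead of being spliced into the JavaScript string, which avoids quoting problems with URLs that contain quotes. A minimal sketch, assuming a started WebDriver; the helper name `open_in_new_tab` is made up for illustration:

```python
def open_in_new_tab(driver, url):
    """Ask the browser to open `url` in a new tab via JavaScript."""
    # execute_script forwards extra positional arguments to the script,
    # where they are available as arguments[0], arguments[1], ...
    driver.execute_script("window.open(arguments[0], '_blank');", url)
```

Note that this only opens the tab. The driver keeps pointing at the original tab until you switch with `driver.switch_to.window(...)`.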

+13

In the discussion, Simon clearly stated that:

While the data type used to store the list of handles may be ordered by insertion, the order in which the WebDriver implementation iterates over the window handles to insert them is not required to be stable. The ordering is arbitrary.


Using Selenium v3.x, opening a website in a new tab through Python is now much easier. We need to induce a WebDriverWait for number_of_windows_to_be(2), collect the window handles each time we open a new tab/window, and finally iterate over the window handles and switch_to.window(newly_opened) as necessary. Here is a solution that opens http://www.google.co.in in the initial tab and https://www.yahoo.com in an adjacent tab:

  • Code block:

        from selenium import webdriver
        from selenium.webdriver.support.ui import WebDriverWait
        from selenium.webdriver.support import expected_conditions as EC

        options = webdriver.ChromeOptions()
        options.add_argument("start-maximized")
        options.add_argument('disable-infobars')
        driver = webdriver.Chrome(chrome_options=options, executable_path=r'C:\Utility\BrowserDrivers\chromedriver.exe')
        driver.get("http://www.google.co.in")
        print("Initial Page Title is : %s" %driver.title)
        windows_before = driver.current_window_handle
        print("First Window Handle is : %s" %windows_before)
        driver.execute_script("window.open('https://www.yahoo.com')")
        # wait until the second tab exists before reading the handles
        WebDriverWait(driver, 10).until(EC.number_of_windows_to_be(2))
        windows_after = driver.window_handles
        new_window = [x for x in windows_after if x != windows_before][0]
        # switch_to.window replaces the deprecated switch_to_window
        driver.switch_to.window(new_window)
        print("Page Title after Tab Switching is : %s" %driver.title)
        print("Second Window Handle is : %s" %new_window)

  • Console output:

        Initial Page Title is : Google
        First Window Handle is : CDwindow-B2B3DE3A222B3DA5237840FA574AF780
        Page Title after Tab Switching is : Yahoo
        Second Window Handle is : CDwindow-D7DA7666A0008ED91991C623105A2EC4


Outro

A related discussion can be found in "Best way to track and iterate over tabs and windows using WindowHandles using Selenium".
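
Because the order of the window handles is arbitrary, a reusable helper should identify the new tab by comparing against the handles seen before it was opened. The sketch below is one possible way to wrap that; the name `open_tab_and_switch` and its parameters are hypothetical, and a plain polling loop stands in for WebDriverWait so the block needs nothing beyond the standard library.

```python
import time

def open_tab_and_switch(driver, url, timeout=10.0, poll=0.1):
    """Open `url` in a new tab, wait for its handle, switch to it, return it."""
    before = set(driver.window_handles)
    driver.execute_script("window.open(arguments[0]);", url)
    deadline = time.monotonic() + timeout
    while True:
        # Compare with the earlier snapshot instead of relying on list order.
        new = [h for h in driver.window_handles if h not in before]
        if new:
            driver.switch_to.window(new[0])
            return new[0]
        if time.monotonic() >= deadline:
            raise TimeoutError("no new window handle appeared within %.1f s" % timeout)
        time.sleep(poll)
```

Unlike waiting for a fixed count such as number_of_windows_to_be(2), this keeps working when more than two tabs are already open.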

+5

After a long struggle, the method below worked for me:

    driver.find_element_by_tag_name('body').send_keys(Keys.CONTROL + 't')
    driver.find_element_by_tag_name('body').send_keys(Keys.CONTROL + Keys.TAB)
    windows = driver.window_handles
    time.sleep(3)
    driver.switch_to.window(windows[1])
+3

Going by the definition on the Selenium site: "Primarily, it is for automating web applications for testing purposes, but is certainly not limited to just that. Boring web-based administration tasks can (and should!) be automated as well." As you can see, the main purpose of Selenium is testing, and beyond that you can automate administrative tasks so you don't have to click through websites yourself. Using it as a crawler like this is just a waste of time. I think you should focus on crawling instead; take a look at http://scrapy.org, which is the most widely used Python framework for extracting data from websites.

0

I tried for a very long time to duplicate tabs in Chrome using action_keys and send_keys on the body. The only thing that worked for me is the answer here. This is what my duplicate-tabs definition ended up looking like; it is probably not the best, but it works great for me.

    def duplicate_tabs(number, chromewebdriver):
        # Once on the page we want to open a bunch of tabs
        url = chromewebdriver.current_url
        for i in range(number):
            print('opened tab: '+str(i))
            chromewebdriver.execute_script("window.open('"+url+"', 'new_window"+str(i)+"')")

It basically runs JavaScript from within Python, which is incredibly useful. Hope this helps someone.

Note: I am using Ubuntu, this should not make any difference, but if this does not work for you, this may be the reason.

0

Oddly enough, there are so many answers, and they all use workarounds like JS and keyboard shortcuts instead of just using Selenium's own function:

    # Command lives in selenium.webdriver.remote.command
    from selenium.webdriver.remote.command import Command

    def newTab(driver, url="about:blank"):
        wnd = driver.execute(Command.NEW_WINDOW)
        handle = wnd["value"]["handle"]
        driver.switch_to.window(handle)
        driver.get(url)  # changes the handle
        return driver.current_window_handle
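
In Selenium 4 and later the same thing is exposed as a public method, `driver.switch_to.new_window("tab")`, which creates the tab and focuses it in one call. A minimal sketch, assuming Selenium 4+ is installed; the helper name `new_tab` is just for illustration:

```python
def new_tab(driver, url="about:blank"):
    """Open a new tab through the driver's own API and load `url` in it."""
    # Assumes Selenium 4+: switch_to.new_window() creates the tab
    # and moves the driver's focus to it.
    driver.switch_to.new_window("tab")
    driver.get(url)
    return driver.current_window_handle
```

Passing `"window"` instead of `"tab"` asks for a separate browser window.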
-1

Source: https://habr.com/ru/post/1213053/