Crawling an SSL Site with Scrapy

Question

I need to crawl https://dms.psc.sc.gov/Web/dockets, which uses TLS v1.2, with the Scrapy framework. But when I request the URL, the page fails to load and I get [<twisted.python.failure.Failure <class 'OpenSSL.SSL.Error'>>].

A related issue is discussed on GitHub at https://github.com/scrapy/scrapy/issues/981, but the fix there did not work for me. I have Scrapy v0.24.5 and Twisted >= 14.

When I crawl another site that also uses TLS v1.2, it works, but https://dms.psc.sc.gov does not. How can I solve this problem?

python ssl scrapy

Hassan raza Jun 24 '15 at 13:12

4 answers

Pawel Miech · Answer 1 · 2015-06-24T13:36:25+0000

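Update (2016): there is now a PR for this in Scrapy.

The problem appears to be on the Scrapy side.

It comes down to how the HTTP client in Scrapy sets up SSL:

Scrapy builds its SSL context through Twisted, so the behaviour depends on the Twisted version (14 and later handle SSL differently).

The workaround is to make Scrapy use your own context factory in place of the default ScrapyClientContextFactory.

There is a related discussion on GitHub.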
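Since other TLS v1.2 sites work for you, it may help to check outside Scrapy which protocol versions this particular server will negotiate. Below is a minimal sketch using only the Python 3.7+ standard library. The helper names `pinned_context` and `accepts` are made up for this example, and certificate verification is deliberately switched off so that a failure points at protocol negotiation, not at the certificate chain. Treat it as a diagnostic aid, not as something to put in a spider.

```python
import socket
import ssl


def pinned_context(version):
    """Build a client context that offers exactly one TLS version."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    # Diagnostic only: turn off certificate checks so that a failed
    # handshake is about the protocol, not about the certificate.
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    ctx.minimum_version = version
    ctx.maximum_version = version
    return ctx


def accepts(host, version, port=443, timeout=10.0):
    """Return True if a handshake with `host` succeeds at `version`."""
    ctx = pinned_context(version)
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with ctx.wrap_socket(sock, server_hostname=host):
                return True
    except OSError:  # ssl.SSLError is a subclass of OSError
        return False
```

Calling, for example, `accepts("dms.psc.sc.gov", ssl.TLSVersion.TLSv1_2)` from a machine with network access shows whether a plain TLS v1.2 handshake goes through. If it does while Scrapy still fails, the options Scrapy passes to OpenSSL are the more likely culprit.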

Vinodh Velumayil · Answer 2 · 2015-07-22T07:54:11+0000

1. In settings.py, point Scrapy at the custom context factory:

DOWNLOADER_CLIENTCONTEXTFACTORY = 'testproject.CustomContext.CustomClientContextFactory'

2. Create CustomContext.py in the testproject package:

from OpenSSL import SSL
from twisted.internet.ssl import ClientContextFactory
from twisted.internet._sslverify import ClientTLSOptions
from scrapy.core.downloader.contextfactory import ScrapyClientContextFactory


class CustomClientContextFactory(ScrapyClientContextFactory):

    def getContext(self, hostname=None, port=None):
        ctx = ClientContextFactory.getContext(self)
        # Enable all workarounds to SSL bugs as documented by
        # http://www.openssl.org/docs/ssl/SSL_CTX_set_options.html
        ctx.set_options(SSL.OP_ALL)
        if hostname:
            ClientTLSOptions(hostname, ctx)
        return ctx

With this I could crawl the https site on Windows, but on Ubuntu 14.04 I got the following error:

from twisted.internet._sslverify import ClientTLSOptions
exceptions.ImportError: cannot import name ClientTLSOptions

EDIT: Replace the import

from twisted.internet._sslverify import ClientTLSOptions

with a guarded one:

try:
    # available since twisted 14.0
    from twisted.internet._sslverify import ClientTLSOptions
except ImportError:
    ClientTLSOptions = None
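One thing to watch with this fallback: when the import fails, ClientTLSOptions is None, so the `if hostname:` branch in getContext would then try to call None. A minimal sketch of the guard follows. The helper name `apply_tls_options` is hypothetical and only there to make the check easy to see. In the factory above, the same condition can go straight into getContext.

```python
try:
    # ClientTLSOptions is a private Twisted helper, available since 14.0.
    from twisted.internet._sslverify import ClientTLSOptions
except ImportError:
    # Older Twisted (or no Twisted at all): fall back to a marker value.
    ClientTLSOptions = None


def apply_tls_options(hostname, ctx):
    """Attach hostname-aware TLS options to ctx when that is possible.

    Hypothetical helper for illustration. Returns True if the options
    were applied, and False if there was no hostname or the Twisted
    helper is unavailable.
    """
    if hostname and ClientTLSOptions is not None:
        ClientTLSOptions(hostname, ctx)
        return True
    return False
```

On an older Twisted the context is simply returned without the hostname-specific options and no exception is raised.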

Zoltán · Answer 3 · 2017-03-17T18:44:17+0000

For anyone getting "TypeError: unbound method getContext() must be called with ClientContextFactory instance as first argument ...":

Replace ctx = ClientContextFactory.getContext(self)

with ctx = ScrapyClientContextFactory.getContext(self)

goodgrief · Answer 4 · 2017-04-11T13:07:17+0000

Vinodh Velumayil's answer is right, but I had to change this line:

ctx = ClientContextFactory.getContext(self)

to:

inst = ClientContextFactory()
ctx = inst.getContext()
