Merge pull request #42 from threatexpress/dev

Add McAfee Web Gateway and Bluecoat Fixes
Andrew Chiles 2 years ago
Parent
Commit
38cb7ef4dd
4 changed files with 183 additions and 133 deletions
  1. CHANGELOG (+80, -0)
  2. README.md (+11, -66)
  3. dockerfile (+1, -1)
  4. domainhunter.py (+91, -66)

+ 80 - 0
CHANGELOG

@@ -0,0 +1,80 @@
+# CHANGELOG
+
+## 25 October 2022
+
++ Add McAfee Web Gateway (Cloud) reputation check by @froyo75 adopted from PR [#37](https://github.com/threatexpress/domainhunter/pull/37)
++ Bluecoat XSRF fixes by @froyo75 adopted from PR [#37](https://github.com/threatexpress/domainhunter/pull/37) (Huge thanks!)
++ Fix for ExpiredDomains login by @davidlebr1
++ Update Dockerfile
++ Separate this CHANGELOG from the README
+
+## 07 January 2021
+
++ Fix Symantec Site Review (Bluecoat) reputation checking to bypass XSRF and additional POST parameter checks
++ Temporary fix for broken malware domains link. This service is no longer offered in the form used by DomainHunter.
++ Add internal code comments for readability
++ Add check for ExpiredDomains username before asking for a password
++ Disable Google Safe Browsing/PhishTank reputation from MxToolbox as this service has changed
+
+## 21 February 2020
+
++ updated Pillow version to support Python3.7+
++ Add instructions to install using pipenv
+
+## 13 August 2019
+
++ Added authentication support for ExpiredDomains.net thanks to acole76!
+
+## 5 October 2018
+
++ Fixed logic for filtering domains with desirable categorizations. Previously, some error conditions weren't filtered and would result in domains without a valid categorization making it into the final list.
+
+## 4 October 2018
+
++ Tweaked parsing logic
++ Fixed changes parsed columns indexes
+
+## 17 September 2018
+
++ Fixed Symantec WebPulse Site Review parsing errors caused by service updates
+
+## 18 May 2018
+
++ Add --alexa switch to control Alexa ranked site filtering
+
+## 16 May 2018
+
++ Update queries to increase probability of quickly finding a domain available for instant purchase. Previously, many reported domains had an "In Auction" or "Make an Offer" status. New criteria: .com|.net|.org + Alexa Ranked + Available for Purchase
++ Improved logic to filter out uncategorized and some potentially undesirable domain categorizations in the final text table and HTML output
++ Removed unnecessary columns from HTML report
+
+## 6 May 2018
+
++ Fixed expired domains parsing when performing a keyword search
++ Minor HTML and text table output updates
++ Filtered reputation checks to only execute for .COM, .ORG, and .NET domains and removed check for Archive.org records when performing a default or keyword search. Credit to @christruncer for the original PR and idea.
+
+## 11 April 2018
+
++ Added OCR support for CAPTCHA solving with tesseract. Thanks to t94j0 for the idea in [AIRMASTER](https://github.com/t94j0/AIRMASTER)  
++ Added support for input file list of potential domains (-f/--filename)
++ Changed -q/--query switch to -k/--keyword to better match its purpose
++ Added additional error checking for ExpiredDomains.net parsing
+
+## 9 April 2018
+
++ Added -t switch for timing control. -t <1-5>
++ Added Google SafeBrowsing and PhishTank reputation checks
++ Fixed bug in IBMXForce response parsing
+
+## 7 April 2018
+
++ Fixed support for Symantec WebPulse Site Review (formerly Blue Coat WebFilter)
++ Added Cisco Talos Domain Reputation check
++ Added feature to perform a reputation check against a single non-expired domain. This is useful when monitoring reputation for domains used in ongoing campaigns and engagements.
+
+## 6 June 2017
+
++ Added python 3 support
++ Code cleanup and bug fixes
++ Added Status column (Available, Make Offer, Price, Backorder, etc)

+ 11 - 66
README.md

@@ -6,77 +6,20 @@ Domain name selection is an important aspect of preparation for penetration test
 
 This Python based tool was written to quickly query the Expireddomains.net search engine for expired/available domains with a previous history of use. It then optionally queries for domain reputation against services like Symantec Site Review (BlueCoat), IBM X-Force, and Cisco Talos. The primary tool output is a timestamped HTML table style report.
 
-## Changelog
-
-- 07 January 2021
-   + Fix Symantec Site Review (Bluecoat) reputation checking to bypass XSRF and additional POST parameter checks
-   + Temporary fix for broken malware domains link. This service is no longer offered in the form used by DomainHunter.
-   + Add internal code comments for readability
-   + Add check for ExpiredDomains username before asking for a password
-   + Disable Google Safe Browsing/PhishTank reputation from MxToolbox as this service has changed
-
-- 21 February 2020
-   + updated Pillow version to support Python3.7+
-   + Add instructions to install using pipenv
-
-- 13 August 2019
-   + Added authentication support for ExpiredDomains.net thanks to acole76!
-   
-- 5 October 2018
-   + Fixed logic for filtering domains with desirable categorizations. Previously, some error conditions weren't filtered and would result in domains without a valid categorization making it into the final list.
-
-- 4 October 2018
-   + Tweaked parsing logic
-   + Fixed changes parsed columns indexes
-
-- 17 September 2018
-    + Fixed Symantec WebPulse Site Review parsing errors caused by service updates
-
-- 18 May 2018
-    + Add --alexa switch to control Alexa ranked site filtering
-
-- 16 May 2018
-    + Update queries to increase probability of quickly finding a domain available for instant purchase. Previously, many reported domains had an "In Auction" or "Make an Offer" status. New criteria: .com|.net|.org + Alexa Ranked + Available for Purchase
-    + Improved logic to filter out uncategorized and some potentially undesirable domain categorizations in the final text table and HTML output
-    + Removed unnecessary columns from HTML report
-
-- 6 May 2018
-    + Fixed expired domains parsing when performing a keyword search
-    + Minor HTML and text table output updates
-    + Filtered reputation checks to only execute for .COM, .ORG, and .NET domains and removed check for Archive.org records when performing a default or keyword search. Credit to @christruncer for the original PR and idea.
-
-- 11 April 2018
-    + Added OCR support for CAPTCHA solving with tesseract. Thanks to t94j0 for the idea in [AIRMASTER](https://github.com/t94j0/AIRMASTER)  
-    + Added support for input file list of potential domains (-f/--filename)
-    + Changed -q/--query switch to -k/--keyword to better match its purpose
-    + Added additional error checking for ExpiredDomains.net parsing
-
-- 9 April 2018
-    + Added -t switch for timing control. -t <1-5>
-    + Added Google SafeBrowsing and PhishTank reputation checks
-    + Fixed bug in IBMXForce response parsing
-
-- 7 April 2018
-    + Fixed support for Symantec WebPulse Site Review (formerly Blue Coat WebFilter)
-    + Added Cisco Talos Domain Reputation check
-    + Added feature to perform a reputation check against a single non-expired domain. This is useful when monitoring reputation for domains used in ongoing campaigns and engagements.
-
-- 6 June 2017
-    + Added python 3 support
-    + Code cleanup and bug fixes
-    + Added Status column (Available, Make Offer, Price, Backorder, etc)
+See [CHANGELOG](./CHANGELOG) for history of updates and release notes!
 
 ## Features
 
 - Retrieve specified number of recently expired and deleted domains (.com, .net, .org) from ExpiredDomains.net
+  - Note: You will need credentials from expireddomains.net for full functionality
 - Retrieve available domains based on keyword search from ExpiredDomains.net
-- Perform reputation checks against the Symantec WebPulse Site Review (BlueCoat), IBM x-Force, Cisco Talos, Google SafeBrowsing, and PhishTank services
+- Perform reputation checks against the Symantec WebPulse Site Review (BlueCoat), IBM x-Force, and Cisco Talos
 - Sort results by domain age (if known) and filter for reputation
 - Text-based table and HTML report output with links to reputation sources and Archive.org entry
 
 ## Installation
 
-__Direct Installation__
+### Direct Installation
 
 Install Python requirements
 
@@ -88,7 +31,7 @@ Optional - Install additional OCR support dependencies
 
 - MAC OSX: `brew install tesseract`
 
-__pipenv installation__
+### pipenv installation
 
     pipenv --python 3.7
     pipenv install
@@ -97,9 +40,13 @@ Optional - Install additional OCR support dependencies
 
 - Debian/Ubuntu: `apt-get install tesseract-ocr python3-pil`
 
-## Tip
+### Docker
 
-You will need credentials from expireddomains.net for full functionality
+1. Build the image
+`docker build -t domainhunter .`
+
+2. Run it with your arguments
+`docker run -it domainhunter [args]`
 
 ## Usage
 
@@ -155,8 +102,6 @@ Perform all reputation checks for a single domain
     [*] Downloading malware domain list from http://mirror1.malwaredomains.com/files/justdomains
 
     [*] Fetching domain reputation for: mydomain.com
-    [*] Google SafeBrowsing and PhishTank: mydomain.com
-    [+] mydomain.com: No issues found
     [*] BlueCoat: mydomain.com
     [+] mydomain.com: Technology/Internet
     [*] IBM xForce: mydomain.com

+ 1 - 1
dockerfile

@@ -3,7 +3,7 @@
 #run it:
 #docker run -it domainhunter:1.0 [args]
 
-FROM ubuntu:16.04
+FROM python:3
 
 RUN apt-get update \
 	&& apt-get install python3-pip -y\

+ 91 - 66
domainhunter.py

@@ -19,7 +19,10 @@ import sys
 from urllib.parse import urlparse
 import getpass
 
-__version__ = "20210107"
+# Bluecoat XSRF
+from hashlib import sha256
+
+__version__ = "20221025"
 
 ## Functions
 
@@ -93,11 +96,22 @@ def checkBluecoat(domain):
             'U2NyaXB0aW5nIGFnYWluc3QgU2l0ZSBSZXZpZXcgaXMgYWdhaW5zdCB0aGUgU2l0ZSBSZXZpZXcgVGVybXMgb2YgU2VydmljZQ=='
         ]
         
+        # New Bluecoat XSRF Code added May 2022 thanks to @froyo75
+        xsrf_token_parts = token.split('-')
+        xsrf_random_part = random.choice(xsrf_token_parts)
+        key_data = xsrf_random_part + ': ' + token
+        # Key used as part of POST data
+        key = sha256(key_data.encode('utf-8')).hexdigest()
+        random_phrase = base64.b64decode(random.choice(phrases)).decode('utf-8')
+        phrase_data = xsrf_random_part + ': ' + random_phrase
+        # Phrase used as part of POST data
+        phrase = sha256(phrase_data.encode('utf-8')).hexdigest()
+        
         postData = {
             'url':domain,
             'captcha':'',
-            'key':'%032x' % random.getrandbits(256), # Generate a random 256bit "hash-like" string
-            'phrase':random.choice(phrases), # Pick a random base64 phrase from the list
+            'key':key,
+            'phrase':phrase, # SHA-256 hex digest of the token part plus a randomly chosen decoded phrase
             'source':'new-lookup'}
 
         headers = {'User-Agent':useragent,
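
Review note: the hunk above replaces the random `key` and raw base64 `phrase` with two SHA-256 hex digests that share one randomly chosen segment of the XSRF token. The same derivation can be sketched as a standalone function. The name `derive_xsrf_fields`, the `rng` parameter, and the sample token and phrase are hypothetical and only serve the illustration; the real token comes from the Site Review session.

```python
import base64
import random
from hashlib import sha256


def derive_xsrf_fields(token, phrases, rng=random):
    """Return (key, phrase) hex digests derived from an XSRF token.

    Mirrors the steps in the hunk above: pick one '-'-separated part of
    the token, then hash "<part>: <token>" and "<part>: <decoded phrase>".
    """
    part = rng.choice(token.split('-'))
    key = sha256((part + ': ' + token).encode('utf-8')).hexdigest()
    decoded = base64.b64decode(rng.choice(phrases)).decode('utf-8')
    phrase = sha256((part + ': ' + decoded).encode('utf-8')).hexdigest()
    return key, phrase


# Hypothetical inputs, for illustration only
example_token = 'aaaa-bbbb-cccc'
example_phrases = [base64.b64encode(b'example phrase').decode('ascii')]

key, phrase = derive_xsrf_fields(example_token, example_phrases)
print(len(key), len(phrase))  # 64 64
```

Passing a seeded `random.Random` instance as `rng` makes the output reproducible, which may help when testing the POST body without contacting the service.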
@@ -139,6 +153,7 @@ def checkBluecoat(domain):
                         response = s.get(url=captchasolutionURL,headers=headers,verify=False,proxies=proxies)
 
                         # Try the categorization request again
+
                         response = s.post('https://sitereview.bluecoat.com/resource/lookup',headers=headers,json=postData,verify=False,proxies=proxies)
 
                         responseJSON = json.loads(response.text)
@@ -227,64 +242,68 @@ def checkTalos(domain):
         print('[-] Error retrieving Talos reputation! {0}'.format(e))
         return "error"
 
-def checkMXToolbox(domain):
-    """ Checks the MXToolbox service for Google SafeBrowsing and PhishTank information. Currently broken"""
-    url = 'https://mxtoolbox.com/Public/Tools/BrandReputation.aspx'
-    headers = {'User-Agent':useragent,
-            'Origin':url,
-            'Referer':url}  
+def checkMcAfeeWG(domain):
+    """McAfee Web Gateway Domain Reputation"""
 
-    print('[*] Google SafeBrowsing and PhishTank: {}'.format(domain))
-    
     try:
-        response = s.get(url=url, headers=headers,proxies=proxies,verify=False)
-        
-        soup = BeautifulSoup(response.content,'lxml')
-
-        viewstate = soup.select('input[name=__VIEWSTATE]')[0]['value']
-        viewstategenerator = soup.select('input[name=__VIEWSTATEGENERATOR]')[0]['value']
-        eventvalidation = soup.select('input[name=__EVENTVALIDATION]')[0]['value']
-
-        data = {
-        "__EVENTTARGET": "",
-        "__EVENTARGUMENT": "",
-        "__VIEWSTATE": viewstate,
-        "__VIEWSTATEGENERATOR": viewstategenerator,
-        "__EVENTVALIDATION": eventvalidation,
-        "ctl00$ContentPlaceHolder1$brandReputationUrl": domain,
-        "ctl00$ContentPlaceHolder1$brandReputationDoLookup": "Brand Reputation Lookup",
-        "ctl00$ucSignIn$hfRegCode": 'missing',
-        "ctl00$ucSignIn$hfRedirectSignUp": '/Public/Tools/BrandReputation.aspx',
-        "ctl00$ucSignIn$hfRedirectLogin": '',
-        "ctl00$ucSignIn$txtEmailAddress": '',
-        "ctl00$ucSignIn$cbNewAccount": 'cbNewAccount',
-        "ctl00$ucSignIn$txtFullName": '',
-        "ctl00$ucSignIn$txtModalNewPassword": '',
-        "ctl00$ucSignIn$txtPhone": '',
-        "ctl00$ucSignIn$txtCompanyName": '',
-        "ctl00$ucSignIn$drpTitle": '',
-        "ctl00$ucSignIn$txtTitleName": '',
-        "ctl00$ucSignIn$txtModalPassword": ''
-        }
-          
-        response = s.post(url=url, headers=headers, data=data,proxies=proxies,verify=False)
+        print('[*] McAfee Web Gateway (Cloud): {}'.format(domain))
+
+        # HTTP Session container, used to manage cookies, session tokens and other session information
+        s = requests.Session()
+
+        headers = {
+                'User-Agent':useragent,
+                'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8',
+                'Accept-Language': 'en-US,en;q=0.5',
+                'Accept-Encoding': 'gzip, deflate',
+                'Referer':'https://sitelookup.mcafee.com/'
+                }  
 
-        soup = BeautifulSoup(response.content,'lxml')
+        # Establish our session information
+        response = s.get("https://sitelookup.mcafee.com",headers=headers,verify=False,proxies=proxies)
+
+        # Pull the hidden attributes from the response
+        soup = BeautifulSoup(response.text,"html.parser")
+        hidden_tags = soup.find_all("input",  {"type": "hidden"})
+        for tag in hidden_tags:
+            if tag['name'] == 'sid':
+                sid = tag['value']
+            elif tag['name'] == 'e':
+                e = tag['value']
+            elif tag['name'] == 'c':
+                c = tag['value']
+            elif tag['name'] == 'p':
+                p = tag['value']
+
+        # Build the multipart form used to retrieve the categorization info
+        multipart_form_data = {
+            'sid': (None, sid),
+            'e': (None, e),
+            'c': (None, c),
+            'p': (None, p),
+            'action': (None, 'checksingle'),
+            'product': (None, '14-ts'),
+            'url': (None, domain)
+        }
 
-        a = ''
-        if soup.select('div[id=ctl00_ContentPlaceHolder1_noIssuesFound]'):
-            a = 'No issues found'
+        response = s.post('https://sitelookup.mcafee.com/en/feedback/url',headers=headers,files=multipart_form_data,verify=False,proxies=proxies)
+        if response.status_code == 200:
+            soup = BeautifulSoup(response.text,"html.parser")
+            for table in soup.findAll("table", {"class": ["result-table"]}):
+                datas = table.find_all('td')
+                if "not valid" in datas[2].text:
+                    a = 'Uncategorized'
+                else:
+                    status = datas[2].text
+                    category = (datas[3].text[1:]).strip().replace('-',' -')
+                    web_reputation = datas[4].text
+                    a = '{0}, Status: {1}, Web Reputation: {2}'.format(category,status,web_reputation)
             return a
         else:
-            if soup.select('div[id=ctl00_ContentPlaceHolder1_googleSafeBrowsingIssuesFound]'):
-                a = 'Google SafeBrowsing Issues Found. '
-        
-            if soup.select('div[id=ctl00_ContentPlaceHolder1_phishTankIssuesFound]'):
-                a += 'PhishTank Issues Found'
-            return a
+            raise Exception
 
     except Exception as e:
-        print('[-] Error retrieving Google SafeBrowsing and PhishTank reputation!')
+        print('[-] Error retrieving McAfee Web Gateway Domain Reputation!')
         return "error"
 
 def downloadMalwareDomains(malwaredomainsURL):
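
Review note: in `checkMcAfeeWG` above, `sid`, `e`, `c`, and `p` are only assigned if the matching hidden inputs are present, so a changed page layout would most likely surface as the generic "error" result via the blanket `except`. A minimal sketch of collecting the fields into a dict and failing with an explicit message follows. It swaps BeautifulSoup for the standard library `html.parser` purely to stay self-contained; `HiddenInputCollector`, `extract_hidden_fields`, and the sample HTML are hypothetical.

```python
from html.parser import HTMLParser

# Hidden form fields the lookup form is expected to carry (from the diff above)
REQUIRED_FIELDS = ('sid', 'e', 'c', 'p')


class HiddenInputCollector(HTMLParser):
    """Collect name/value pairs of <input type="hidden"> elements."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == 'input' and attrs.get('type') == 'hidden' and 'name' in attrs:
            self.fields[attrs['name']] = attrs.get('value') or ''


def extract_hidden_fields(html, required=REQUIRED_FIELDS):
    """Return the required hidden fields, or raise ValueError naming the missing ones."""
    collector = HiddenInputCollector()
    collector.feed(html)
    missing = [name for name in required if name not in collector.fields]
    if missing:
        raise ValueError('missing hidden form fields: {}'.format(', '.join(missing)))
    return {name: collector.fields[name] for name in required}


# Hypothetical page fragment, for illustration only
sample_html = (
    '<form>'
    '<input type="hidden" name="sid" value="1">'
    '<input type="hidden" name="e" value="2">'
    '<input type="hidden" name="c" value="3">'
    '<input type="hidden" name="p" value="4"/>'
    '<input type="text" name="url">'
    '</form>'
)
print(extract_hidden_fields(sample_html))
```

The returned dict could then be expanded into the `(None, value)` tuples that the `files=` argument in the diff uses for multipart form fields.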
@@ -315,25 +334,21 @@ def checkDomain(domain):
     ciscotalos = checkTalos(domain)
     print("[+] {}: {}".format(domain, ciscotalos))
 
-    #This service has completely changed, removing for now
-    #mxtoolbox = checkMXToolbox(domain)
-    #print("[+] {}: {}".format(domain, mxtoolbox))
-    mxtoolbox = "-"
-
     umbrella = "not available"
     if len(umbrella_apikey):
         umbrella = checkUmbrella(domain)
         print("[+] {}: {}".format(domain, umbrella))
 
+    mcafeewg = checkMcAfeeWG(domain)
+    print("[+] {}: {}".format(domain, mcafeewg))
+
     print("")
     
-    results = [domain,bluecoat,ibmxforce,ciscotalos,umbrella,mxtoolbox]
+    results = [domain,bluecoat,ibmxforce,ciscotalos,umbrella,mcafeewg]
     return results
 
 def solveCaptcha(url,session):  
-    """Downloads CAPTCHA image and saves to current directory for OCR with tesseract
-    Returns CAPTCHA string or False if error occured
-    """
+    """Downloads CAPTCHA image and saves to current directory for OCR with tesseract"""
     
     jpeg = 'captcha.jpg'
     
@@ -377,6 +392,7 @@ def loginExpiredDomains():
     """Login to the ExpiredDomains site with supplied credentials"""
 
     data = "login=%s&password=%s&redirect_2_url=/begin" % (username, password)
+    
     headers["Content-Type"] = "application/x-www-form-urlencoded"
     r = s.post(expireddomainHost + "/login/", headers=headers, data=data, proxies=proxies, verify=False, allow_redirects=False)
     cookies = s.cookies.get_dict()
@@ -553,7 +569,7 @@ If you plan to use this content for illegal purpose, don't.  Have a nice day :)\
                     doSleep(timing)
 
                 # Print results table
-                header = ['Domain', 'BlueCoat', 'IBM X-Force', 'Cisco Talos', 'Umbrella', 'MXToolbox']
+                header = ['Domain', 'BlueCoat', 'IBM X-Force', 'Cisco Talos', 'Umbrella', 'McAfee Web Gateway (Cloud)']
                 print(drawTable(header,data))
 
         except KeyboardInterrupt:
@@ -719,15 +735,23 @@ If you plan to use this content for illegal purpose, don't.  Have a nice day :)\
                     if umbrella not in unwantedResults:
                         print("[+] Umbrella {}: {}".format(domain, umbrella))
 
+                mcafeewg = checkMcAfeeWG(domain)
+                if mcafeewg not in unwantedResults:
+                    print("[+] McAfee Web Gateway (Cloud) {}: {}".format(domain, mcafeewg))
+
                 print("")
                 # Sleep to avoid captchas
                 doSleep(timing)
 
             # Append entry to new list with reputation if at least one service reports reputation
-            if not ((bluecoat in ('Uncategorized','badurl','Suspicious','Malicious Sources/Malnets','captcha','Phishing','Placeholders','Spam','error')) \
-                and (ibmxforce in ('Not found.','error')) and (ciscotalos in ('Uncategorized','error')) and (umbrella in ('Uncategorized','None'))):
+            if not (\
+                (bluecoat in ('Uncategorized','badurl','Suspicious','Malicious Sources/Malnets','captcha','Phishing','Placeholders','Spam','error')) \
+                and (ibmxforce in ('Not found.','error')) \
+                and (ciscotalos in ('Uncategorized','error')) \
+                and (umbrella in ('Uncategorized','None')) \
+                and (mcafeewg in ('Uncategorized','error'))):
                 
-                data.append([domain,birthdate,archiveentries,availabletlds,status,bluecoat,ibmxforce,ciscotalos,umbrella])
+                data.append([domain,birthdate,archiveentries,availabletlds,status,bluecoat,ibmxforce,ciscotalos,umbrella,mcafeewg])
 
     # Sort domain list by column 2 (Birth Year)
     sortedDomains = sorted(data, key=lambda x: x[1], reverse=True) 
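
Review note: the condition above keeps a domain unless every service returned one of its uninformative values. The same rule can be expressed as a table plus `all()`, which may be easier to extend when another service is added. This is a sketch only; `UNINFORMATIVE`, `has_useful_reputation`, and the sample results are hypothetical names and data, while the value lists are copied from the diff.

```python
# Values each service returns when it has nothing useful to report,
# copied from the condition in the hunk above.
UNINFORMATIVE = {
    'bluecoat': ('Uncategorized', 'badurl', 'Suspicious', 'Malicious Sources/Malnets',
                 'captcha', 'Phishing', 'Placeholders', 'Spam', 'error'),
    'ibmxforce': ('Not found.', 'error'),
    'ciscotalos': ('Uncategorized', 'error'),
    'umbrella': ('Uncategorized', 'None'),
    'mcafeewg': ('Uncategorized', 'error'),
}


def has_useful_reputation(results):
    """True if at least one service reported something outside its uninformative list.

    `results` must contain one entry per service named in UNINFORMATIVE.
    """
    return not all(results[service] in bad for service, bad in UNINFORMATIVE.items())


# Hypothetical results, for illustration only
nothing_useful = {
    'bluecoat': 'Uncategorized',
    'ibmxforce': 'Not found.',
    'ciscotalos': 'error',
    'umbrella': 'None',
    'mcafeewg': 'Uncategorized',
}
one_hit = dict(nothing_useful, ciscotalos='Computers and Internet')

print(has_useful_reputation(nothing_useful))  # False
print(has_useful_reputation(one_hit))         # True
```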
@@ -777,6 +801,7 @@ If you plan to use this content for illegal purpose, don't.  Have a nice day :)\
         htmlTableBody += '<td><a href="https://exchange.xforce.ibmcloud.com/url/{}" target="_blank">{}</a></td>'.format(i[0],i[6]) # IBM x-Force Categorization
         htmlTableBody += '<td><a href="https://www.talosintelligence.com/reputation_center/lookup?search={}" target="_blank">{}</a></td>'.format(i[0],i[7]) # Cisco Talos
         htmlTableBody += '<td>{}</td>'.format(i[8]) # Cisco Umbrella
+        htmlTableBody += '<td><a href="https://sitelookup.mcafee.com/en/feedback/url?action=checksingle&url=http%3A%2F%2F{}&product=14-ts" target="_blank">{}</a></td>'.format(i[0],i[9]) # McAfee Web Gateway (Cloud)
         htmlTableBody += '<td><a href="http://www.borderware.com/domain_lookup.php?ip={}" target="_blank">WatchGuard</a></td>'.format(i[0]) # Borderware WatchGuard
         htmlTableBody += '<td><a href="https://www.namecheap.com/domains/registration/results.aspx?domain={}" target="_blank">Namecheap</a></td>'.format(i[0]) # Namecheap
         htmlTableBody += '<td><a href="http://web.archive.org/web/*/{}" target="_blank">Archive.org</a></td>'.format(i[0]) # Archive.org