This article describes the methodology of external reconnaissance, a key phase of pentesting and cyber security. It details the steps and techniques that help security researchers and pentesters identify and analyze the assets and resources belonging to a target organization or company.
So, you’ve been told that everything owned by a company is in scope, and you want to find out what that company actually owns. The goal of this stage is to obtain all the companies belonging to the parent company, and then all the assets of those companies.
For this we are going to:
Find the acquisitions of the main company; this will give us the companies within scope.
Find the ASN (if any) of each company; this will give us the IP address ranges belonging to each company.
Use reverse whois lookups to find other records (organization names, domains…) related to the first one (this can be done recursively).
Use other techniques such as the shodan org and ssl filters to search for other assets (the ssl trick can be applied recursively).
First of all, we need to know which other companies belong to the main company. One option is to visit https://www.crunchbase.com/ , find the main company and click on “Acquisitions”. There you will see the companies acquired by the main one. Another option is to visit the parent company’s Wikipedia page and search for acquisitions.
Okay, at this point you should know all the companies in scope. Let’s figure out how to find their assets.
An Autonomous System Number (ASN) is a unique number assigned to an Autonomous System (AS) by the Internet Assigned Numbers Authority (IANA). An AS consists of blocks of IP addresses with a clearly defined routing policy for reaching external networks; it is administered by a single organization but may be made up of several operators.
It is interesting to know whether the company has been assigned any ASN in order to look up its IP address ranges. It is worth performing a vulnerability test against all hosts within scope and looking for domains inside those IP ranges. You can search by company name, IP address or domain at https://bgp.he.net/. Depending on the company’s region, these links may be useful for collecting additional data: AFRINIC (Africa), ARIN (North America), APNIC (Asia), LACNIC (Latin America), RIPE NCC (Europe). In any case, all the useful information (IP ranges and Whois) probably appears already in the first link.
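As a quick manual alternative, once you know an ASN you can pull the announced IP ranges from a route registry. A minimal sketch, assuming AS394161 is the ASN you found:

whois -h whois.radb.net -- '-i origin AS394161' | grep -Eo "([0-9.]+){4}/[0-9]+" | sort -u   # list announced prefixes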
# You can try to "automate" this with amass, but it's not very recommended
amass intel -org tesla
amass intel -asn 8911,50313,394161

In addition, BBOT's subdomain enumeration automatically aggregates and summarizes ASNs at the end of the scan.
bbot -t tesla.com -f subdomain-enum
...
[INFO] bbot.modules.asn: +----------+---------------------+--------------+----------------+----------------------------+-----------+
[INFO] bbot.modules.asn: | AS394161 | 8.244.131.0/24      | 5            | TESLA          | Tesla Motors, Inc.         | US        |
[INFO] bbot.modules.asn: | AS16509  | 54.148.0.0/15       | 4            | AMAZON-02      | Amazon.com, Inc.           | US        |
[INFO] bbot.modules.asn: | AS394161 | 8.45.124.0/24       | 3            | TESLA          | Tesla Motors, Inc.         | US        |
[INFO] bbot.modules.asn: | AS3356   | 8.32.0.0/12         | 1            | LEVEL3         | Level 3 Parent, LLC        | US        |
[INFO] bbot.modules.asn: | AS3356   | 8.0.0.0/9           | 1            | LEVEL3         | Level 3 Parent, LLC        | US        |
[INFO] bbot.modules.asn: +----------+---------------------+--------------+----------------+----------------------------+-----------+
Organization IP address ranges can also be found using http://asnlookup.com/ (it has a free API). You can determine the IP and ASN of a domain using http://ipv4info.com/.
At this point we know all the assets within scope, so if you are allowed to, you can run a vulnerability scanner (Nessus, OpenVAS) against all hosts. You can also run some port scans or use services like Shodan to find open ports, and depending on what you find, you should take a look at this book to check the possible services running. It might also be worth preparing some lists of default usernames and passwords and trying to brute-force services with https://github.com/x90skysn3k/brutespray .
We know all the companies in scope and their assets; it’s time to find the domains in scope.
Please note that the methods below can also reveal subdomains, and that information should not be underestimated.
First of all, you should find the main domain(s) of each company. For example, for Tesla Inc. this will be tesla.com.
Now that you have found all the IP address ranges of the domains, you can try performing reverse DNS lookups on those IP addresses to find more domains within scope. Try to use a DNS server of the victim or a well-known DNS server (1.1.1.1, 8.8.8.8).
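A minimal sketch of that workflow; the host list and file names below are placeholders you adapt to your scope and rules of engagement:

# Service/version scan of the in-scope hosts (tune ports and timing to your engagement)
nmap -sV -oG scoped_scan.gnmap -iL in_scope_hosts.txt
# The greppable output can then be fed to brutespray for default-credential spraying;
# check brutespray --help first, as its flags differ between versions.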
dnsrecon -r <DNS Range> -n <IP_DNS>            # DNS reverse of all of the addresses
dnsrecon -d facebook.com -r 157.240.221.35/24  # Using facebook's dns
dnsrecon -r 157.240.221.35/24 -n 1.1.1.1       # Using cloudflare's dns
dnsrecon -r 157.240.221.35/24 -n 8.8.8.8       # Using google's dns

Inside a whois record you can find a lot of interesting information such as the organization name, address, email addresses, phone numbers… More interestingly, you can find additional assets related to the company if you perform reverse whois lookups on any of those fields (for example, other whois records showing the same email address). You can use online tools such as:
https://viewdns.info/reversewhois/ – free
https://domaineye.com/reverse-whois – free
https://www.reversewhois.io/ – free
https://www.whoxy.com/ – free web interface, but not a free API
http://reversewhois.domaintools.com/ – not free
https://drs.whoisxmlapi.com/reverse-whois-search – Not free (only 100 free searches)
https://www.domainiq.com/ – not free
You can automate this task with DomLink (requires a Whoxy API key). You can also perform automatic reverse whois discovery with amass:

amass intel -d tesla.com -whois
Note that you can use this technique to find more domain names each time you find a new domain.
If you find the same tracker ID on 2 different pages, you can assume that both pages are managed by the same team. For example, if you see the same Google Analytics ID or the same Adsense ID on multiple pages.
There are several pages and tools that allow you to search for these trackers and more:
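As a quick manual check, you can also look up a tracker ID directly in Shodan; the ID below is a hypothetical placeholder:

shodan search 'http.html:"UA-1234567-1"' --fields ip_str,port,hostnames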
Did you know that we can find related domains and subdomains of our target by searching for the same favicon icon hash? This is exactly what the favihash.py tool created by @m4ll0k2 does. Here’s how to use it:

cat my_targets.txt | xargs -I %% bash -c 'echo "http://%%/favicon.ico"' > targets.txt
python3 favihash.py -f https://target/favicon.ico -t targets.txt -s
Simply put, favihash will allow us to discover domains that have the same favicon hash as our target.
Alternatively, you can also search for technologies using the favicon hash, as explained in this blog post. This means that if you know the favicon hash of a vulnerable version of a web technology, you can search for it in shodan and find more vulnerable hosts:
shodan search org:"Target" http.favicon.hash:116323821 --fields ip_str,port --separator " " | awk '{print $1":"$2}'
Here’s how you can calculate the favicon hash of a website:

import mmh3
import requests
import codecs

def fav_hash(url):
    response = requests.get(url)
    favicon = codecs.encode(response.content, "base64")
    fhash = mmh3.hash(favicon)
    print(f"{url} : {fhash}")
    return fhash
Search web pages for strings that may be shared across different sites of the same organization. A good example is the copyright string. Then search for that string in Google, in other browsers, or even in shodan:
shodan search http.html:"Copyright string"
Usually there is a cron job, for example:
# /etc/crontab
37 13 */10 * * certbot renew --post-hook "systemctl reload nginx"

to renew all the domain certificates on the server. This means that even if the CA used does not reveal the generation time in the validity period, it is possible to find domains belonging to the same company in the certificate transparency logs, because their certificates are renewed at (almost) the same time. For more information, see this entry.
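A minimal sketch of querying certificate transparency manually via crt.sh, assuming tesla.com as the seed domain; comparing the not_before timestamps is what reveals certificates issued by the same renewal job:

curl -s "https://crt.sh/?q=%25.tesla.com&output=json" | jq -r '.[] | "\(.not_before)  \(.common_name)"' | sort -u | head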
Apparently, it is common for people to assign subdomains to IP addresses that belong to cloud providers and, at some point, lose that IP address but forget to delete the DNS record. Therefore, simply by spinning up a virtual machine in the cloud (for example, at DigitalOcean), you may actually end up owning some subdomains.
This post explains the story and provides a script that creates a virtual machine in DigitalOcean, gets the new machine’s IPv4 address, and searches VirusTotal for subdomain records pointing to it.
Note that you can use this technique to find more domain names each time you find a new domain.
You already know the name of the organization owning the IP space, so you can search for that name in shodan using: org:”Tesla, Inc.” Check the hosts found for new, unexpected domains in their TLS certificates.
You can access the TLS certificate of the main web page, obtain the organization name, and then search for that name in the TLS certificates of all web pages known to shodan using the filter ssl:”Tesla Motors”, or use a tool like sslsearch.
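A minimal sketch with the shodan CLI, assuming “Tesla Motors” is the organization name taken from the main certificate:

shodan search 'ssl:"Tesla Motors"' --fields ip_str,port,hostnames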
Assetfinder is a tool that looks for domains related to a main domain as well as their subdomains; quite handy.
Check for domain hijacking. Maybe a company is using a domain but has lost ownership of it. Just register it (if it’s cheap enough) and let the company know.
If you find any domain with an IP address different from the ones you already found during asset discovery, you should perform a basic vulnerability scan (using Nessus or OpenVAS) and some port scanning with nmap/masscan/shodan. Depending on which services are running, you may find tricks in this book to “attack” them. Note that sometimes the domain is hosted on an IP address that is not controlled by the client and is therefore out of scope; be careful.
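A minimal usage sketch; without --subs-only, assetfinder also returns related apex domains found in its sources:

assetfinder tesla.com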
Subdomains
We know all the companies within the scope, all the assets of each company, and all the domains associated with the companies. Now is the time to find all possible subdomains of each domain found.
Let’s try to get subdomains from the DNS records. We should also try a zone transfer (if the server is vulnerable, you should report it); see the dig example below.
dnsrecon -a -d tesla.com
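A minimal zone-transfer check with dig; ns1.tesla.com is a placeholder for whichever name servers the first query returns:

dig ns tesla.com +short
dig axfr tesla.com @ns1.tesla.com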
The fastest way to obtain a large number of subdomains is to search external sources. The most used tools are the following (for better results, configure their API keys):

# subdomains
bbot -t tesla.com -f subdomain-enum

# subdomains (passive only)
bbot -t tesla.com -f subdomain-enum -rf passive

# subdomains + port scan + web screenshots
bbot -t tesla.com -f subdomain-enum -m naabu gowitness -n my_scan -o .

amass enum [-active] [-ip] -d tesla.com
amass enum -d tesla.com | grep tesla.com # To just list subdomains

# Subfinder, use -silent to only have subdomains in the output
./subfinder-linux-amd64 -d tesla.com [-silent]

# findomain, use -silent to only have subdomains in the output
./findomain-linux -t tesla.com [--quiet]
python3 oneforall.py --target tesla.com [--dns False] [--req False] [--brute False] run
assetfinder --subs-only <domain>
# It requires that you create a sudomy.api file with API keys
sudomy -d tesla.com
vita -d tesla.com
theHarvester -d tesla.com -b "anubis, baidu, bing, binaryedge, bingapi, bufferoverun, censys, certspotter, crtsh, dnsdumpster, duckduckgo, fullhunt, github-code, google, hackertarget, hunter, intelx, linkedin, linkedin_links, n45ht, omnisint, otx, pentesttools, projectdiscovery, qwant, rapiddns, rocketreach, securityTrails, spyse, sublist3r, threatcrowd, threatminer, trello, twitter, urlscan, virustotal, yahoo, zoomeye"
You can access this data using chaospy or even access the scope used by this project https://github.com/projectdiscovery/chaos-public-program-list
You can find a comparison of many of these tools here: https://blog.blacklanternsecurity.com/p/subdomain-enumeration-tool-face-off
Let’s try to find new subdomains by brute-forcing DNS servers with lists of possible subdomain names. For this step you will need some common subdomain wordlists, such as:
https://gist.github.com/jhaddix/86a06c5dc309d08580a018c66354a056
https://wordlists-cdn.assetnote.io/data/manual/best-dns-wordlist.txt
https://localdomain.pw/subdomain-bruteforce-list/all.txt.zip
https://github.com/danielmiessler/SecLists/tree/master/Discovery/DNS
You will also need the IP addresses of good DNS resolvers. To build a list of trusted DNS resolvers, you can download the resolvers from https://public-dns.info/nameservers-all.txt and use dnsvalidator to filter them. Or you can use: https://raw.githubusercontent.com/trickest/resolvers/main/resolvers-trusted.txt
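A minimal dnsvalidator sketch based on its documented usage; the thread count and output file name are placeholders:

dnsvalidator -tL https://public-dns.info/nameservers.txt -threads 100 -o resolvers.txt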
massdns : This was the first tool to perform efficient DNS brute-force. It is very fast, but it is prone to false positives.
sed 's/$/.domain.com/' subdomains.txt > bf-subdomains.txt
./massdns -r resolvers.txt -w /tmp/results.txt bf-subdomains.txt
grep -E "tesla.com. [0-9]+ IN A .+" /tmp/results.txt

gobuster: this one apparently only uses one resolver
gobuster dns -d mysite.com -t 50 -w subdomains.txt
shuffledns is a wrapper around massdns, written in Go, that allows you to enumerate valid subdomains using active brute force, as well as handle wildcards and support easy input/output.
shuffledns -d example.com -list example-subdomains.txt -r resolvers.txt
puredns : It also uses massdns.
puredns bruteforce all.txt domain.com
aiodnsbrute uses asyncio to asynchronously resolve domain names.
aiodnsbrute -r resolvers -w wordlist.txt -vv -t 1024 domain.com
After finding subdomains using open sources and brute-forcing, you can generate permutations of the found subdomains to try to find even more. Several tools are useful for this:
cat subdomains.txt | dnsgen -
goaltdns -l subdomains.txt -w /tmp/words-permutations.txt -o /tmp/final-words-s3.txt
gotator -sub subdomains.txt -silent [-perm /tmp/words-permutations.txt]
altdns -i subdomains.txt -w /tmp/words-permutations.txt -o /tmp/asd3
cat subdomains.txt | dmut -d /tmp/words-permutations.txt -w 100 \
    --dns-errorLimit 10 --use-pb --verbose -s /tmp/resolvers-trusted.txt

regulator: for more information read this post, but basically it takes the main parts of the discovered subdomains and mixes them to find even more subdomains.

python3 main.py adobe.com adobe adobe.rules
make_brute_list.sh adobe.rules adobe.brute
puredns resolve adobe.brute --write adobe.valid

subzuf is a subdomain brute-force fuzzer combined with an extremely simple but effective DNS-response-guided algorithm. It uses a given set of input data, such as a custom wordlist or historical DNS/TLS records, to accurately synthesize more matching domain names and expand them even further in a loop based on the information gathered during the DNS scan.
echo www | subzuf facebook.com
If you find an IP address hosting one or more web pages that belong to subdomains, you can try to find other subdomains with web pages on that IP by searching OSINT sources for domains on the IP or by brute-forcing VHost domain names on that IP.
You can find some virtual hosts in IP addresses using HostHunter or other APIs.
If you suspect that some subdomain may be hidden on a web server, you can try to brute-force it:

ffuf -c -w /path/to/wordlist -u http://victim.com -H "Host: FUZZ.victim.com"

gobuster vhost -u https://mysite.com -t 50 -w subdomains.txt

wfuzz -c -w /usr/share/wordlists/SecLists/Discovery/DNS/subdomains-top1million-20000.txt --hc 400,404,403 -H "Host: FUZZ.example.com" -u http://example.com -t 100

# From https://github.com/allyshka/vhostbrute
vhostbrute.py --url="example.com" --remoteip="10.1.1.15" --base="www.example.com" --vhosts="vhosts_full.list"

# https://github.com/codingo/VHostScan
VHostScan -t example.com
You can even access internal/hidden endpoints with this technique.
Sometimes you’ll find pages that only return the Access-Control-Allow-Origin header when a valid domain/subdomain is set in the Origin header. In these scenarios, you can abuse this behavior to discover new subdomains.
ffuf -w subdomains-top1million-5000.txt -u http://10.10.10.208 -H 'Origin: http://FUZZ.crossfit.htb' -mr "Access-Control-Allow-Origin" -ignore-body
When searching for subdomains, keep an eye out for any that point to some kind of bucket, and if so, check the permissions. Also, since at this point you know all the domains inside the scope, try to brute-force possible bucket names and check their permissions.
You can monitor whether new subdomains of a domain are created by watching the Certificate Transparency logs.
Looking for vulnerabilities
Check for possible subdomain takeovers. If a subdomain points to an S3 bucket, check the permissions.
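A minimal sketch for an anonymous permission check on a bucket you discover; the bucket name below is a placeholder:

aws s3 ls s3://some-target-bucket --no-sign-request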
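A quick manual check for a dangling CNAME that could enable a takeover; the subdomain below is a hypothetical example:

dig +short CNAME shop.tesla.com   # if this points to a deprovisioned third-party service, investigate further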
If you find any subdomain with an IP address different from the ones you already found during asset discovery, you should perform a basic vulnerability scan (using Nessus or OpenVAS) and some port scanning with nmap/masscan/shodan. Depending on which services are running, you may find tricks in this book to “attack” them. Note that sometimes a subdomain is hosted on an IP address that is not controlled by the client and is therefore out of scope; be careful.
In the initial steps you may have found some IP address ranges, domains and subdomains. Now it’s time to gather all the IP addresses from those ranges and the IPs of the domains/subdomains (via DNS queries).
Using the services of the following free API, you can also find previous IP addresses used by domains and subdomains. These IP addresses may still belong to the client (and may allow you to find a CloudFlare bypass):
https://securitytrails.com/
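A minimal sketch of querying DNS history via the SecurityTrails API; the API key is assumed to be stored in an environment variable:

curl -s "https://api.securitytrails.com/v1/history/tesla.com/dns/a" -H "APIKEY: $SECURITYTRAILS_KEY" | jq .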
You can also check for domains pointing to a specific IP address using the hakip2host tool.
Looking for vulnerabilities
Port scan all the IP addresses that do not belong to CDNs (you most likely won’t find anything interesting on those). You may find vulnerabilities in the running services you discover. You can find a guide on how to scan hosts.
We’ve found all the companies and their assets, and we know the IP address ranges, domains and subdomains within scope. It’s time to look for web servers.
In the previous steps you have probably already performed some reconnaissance of the discovered IP addresses and domains, so you may have already found all the possible web servers. However, if you haven’t, we are now going to look at some quick tricks for finding web servers within scope.
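A minimal usage sketch; hakip2host reads IP addresses from stdin and derives hostnames from PTR records and TLS certificates:

cat ips.txt | hakip2host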
Please note that this will be focused on web application detection, so you should also perform vulnerability and port scans (if scope allows).
A quick way to find open ports associated with web servers is masscan. Other handy tools for finding web servers are httprobe, fprobe, and httpx. You just pass a list of domains and they will try to connect to port 80 (http) and 443 (https). Alternatively, you can specify other ports to try:

cat /tmp/domains.txt | httprobe                            # Test all domains inside the file for port 80 and 443
cat /tmp/domains.txt | httprobe -p http:8080 -p https:8443 # Check port 80, 443 and 8080 and 8443
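For the masscan approach mentioned above, a minimal sketch limited to common web ports; the target file, port list and rate are placeholders you adapt to your scope:

masscan -p80,443,8080,8443 --rate 5000 -iL web_scope_ips.txt -oG masscan_web.gnmap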
Now that you have found all the web servers in scope (among the company’s IP addresses and all the domains and subdomains), you probably don’t know where to start. So let’s keep it simple and start by taking screenshots of all of them. Just by looking at the main page, you can find strange endpoints that are more likely to be vulnerable.
To implement this idea, you can use EyeWitness, HttpScreenshot, Aquatone, Shutter or webscreenshot.
What’s more, you can then use eyeballer to run through all the screenshots and tell you which ones are likely to contain vulnerabilities and which ones aren’t.
To find potential cloud resources owned by a company, you should start with a list of keywords that identify that company. For example, for a crypto company you might use words like:
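A minimal sketch with Aquatone, assuming web_servers.txt is the list of hosts gathered above:

cat web_servers.txt | aquatone -out ./aquatone_report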
"crypto", "wallet", "dao", "<domain_name>", <"subdomain_names">.
You’ll also need lists of common words used in buckets:
https://raw.githubusercontent.com/cujanovic/goaltdns/master/words.txt
https://raw.githubusercontent.com/infosec-au/altdns/master/words.txt
https://raw.githubusercontent.com/jordanpotti/AWSBucketDump/master/BucketNames.txt
Then, with those words, you should generate permutations (see the second round of DNS brute-force for more information).
With the resulting word lists, you can use tools like cloud_enum , CloudScraper , cloudlist , or S3Scanner .
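A minimal cloud_enum sketch; the keywords are placeholders and you can pass as many -k flags as needed (a keyword file option also exists; check cloud_enum.py --help):

python3 cloud_enum.py -k tesla -k teslamotors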
Remember, when looking for Cloud Assets, you should be looking for more than just buckets in AWS.
Looking for vulnerabilities
If you find exposed assets such as open buckets or exposed cloud functions, you should access them, see what they offer you, and check whether you can abuse them.
With the domains and subdomains inside the scope, you basically have everything you need to start searching for emails. These are the APIs and tools that have worked best for me for finding a company’s email addresses:
theHarvester – with APIs
API of https://hunter.io/ (free version)
API of https://app.snov.io/ (free version)
API of https://minelead.io/ (free version)
Emails will come in handy later for brute-forcing web logins and authentication services (such as SSH). They are also needed for phishing. Additionally, these APIs will give you even more information about the person behind the email address, which is useful for a phishing campaign.
Credential leak
Using the domains, subdomains, and emails, you can start looking for credentials that were leaked in the past and belong to those emails:
Looking for vulnerabilities
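One way to automate this kind of lookup (an illustrative option, not necessarily the services the list above refers to) is h8mail, which checks emails against several breach sources; most of them require configured API keys:

h8mail -t employee@tesla.com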
If you find valid leaked credentials, it’s a very easy win.
Credential leaks are related to breaches of companies whose sensitive data was leaked and sold. However, companies may also be affected by other leaks that are not listed in those databases:
Credentials and API keys can be leaked in the public repositories of the company or of users working for that company on GitHub. You can use the Leakos tool to download all the public repositories of an organization and its developers and automatically run gitleaks on them.
Leakos can also be used to run gitleaks against all the URLs passed to it as text, since sometimes web pages also contain secrets.
Sometimes attackers or simply employees publish company content on a paste site. This may or may not contain sensitive information, but it is very interesting to look for. You can use the Pastos tool to search more than 80 paste sites at once.
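If you prefer to run gitleaks manually against a repository you have already cloned, a minimal sketch (file and directory names are placeholders):

gitleaks detect --source ./cloned-repo --report-format json --report-path gitleaks_report.json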
Old but gold Google dorks are always useful for finding exposed information that shouldn’t be there. The only problem is that the google-hacking-database contains several thousand possible queries that you cannot run manually. So, you can pick your favorite 10 or use a tool such as Gorks to run them all.
Note that tools that attempt to run the entire database through a regular Google browser will never finish, because Google will block you very, very soon.
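A hypothetical manual dork as an example of what such queries look like, here hunting for exposed backup or config files on the main domain:

site:tesla.com ext:sql OR ext:bak OR ext:env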
If you find valid credentials or API token leaks, it’s a very easy win.