Basics of external reconnaissance in cyber security

13 March 2024 20 minutes Author: Cyber Witcher

The article describes the methodology of external reconnaissance, a key part of pentesting and cyber security work. It details the steps and techniques that help security researchers and pentesters identify and analyze the assets and resources belonging to a target organization or company.

Discovery of assets

So, you’ve been told that everything belonging to a company is in scope, and you want to find out what that company actually owns. The goal of this stage is to identify all the companies owned by the parent company, and then all the assets of those companies.

For this we are going to:

  • Find the acquisitions of the main company; this gives us the companies in scope.

  • Find the ASN (if any) of each company; this gives us the IP address ranges owned by each company.

  • Use reverse whois lookups to search for other entries (organization names, domains…) related to the first one (this can be done recursively).

  • Use other techniques, such as the shodan org and ssl filters, to search for other assets (the ssl trick can be done recursively).

Acquisition

First of all, we need to know which other companies are owned by the main company. One option is to visit https://www.crunchbase.com/, search for the main company and click on “acquisitions”. There you will see the other companies acquired by the main one. Another option is to visit the parent company’s Wikipedia page and search for acquisitions.

Okay, at this point you should know all the companies that are in scope. Let’s find out how to find their assets.

ASN

An Autonomous System Number (ASN) is a unique number assigned to an Autonomous System (AS) by the Internet Assigned Numbers Authority (IANA). An AS consists of blocks of IP addresses that have a clearly defined policy for accessing external networks and are administered by a single organization, but may be made up of several operators.

It is worth checking whether the company has been assigned any ASN, in order to find its IP address ranges. It is then worth running a vulnerability test against all the hosts in scope and looking for domains on those IP addresses. You can search by company name, IP address or domain at https://bgp.he.net/. Depending on the company’s region, these registries may be useful for collecting additional data: AFRINIC (Africa), ARIN (North America), APNIC (Asia), LACNIC (Latin America), RIPE NCC (Europe). In any case, probably all the useful information (IP ranges and whois) already appears in the first link.

#You can try to "automate" this with amass, but it's not very recommended
amass intel -org tesla
amass intel -asn 8911,50313,394161

In addition, BBOT’s subdomain enumeration automatically aggregates and summarizes ASNs at the end of the scan.

bbot -t tesla.com -f subdomain-enum
...
[INFO] bbot.modules.asn: +----------+---------------------+--------------+----------------+----------------------------+-----------+
[INFO] bbot.modules.asn: | AS394161 | 8.244.131.0/24      | 5            | TESLA          | Tesla Motors, Inc.         | US        |
[INFO] bbot.modules.asn: +----------+---------------------+--------------+----------------+----------------------------+-----------+
[INFO] bbot.modules.asn: | AS16509  | 54.148.0.0/15       | 4            | AMAZON-02      | Amazon.com, Inc.           | US        |
[INFO] bbot.modules.asn: +----------+---------------------+--------------+----------------+----------------------------+-----------+
[INFO] bbot.modules.asn: | AS394161 | 8.45.124.0/24       | 3            | TESLA          | Tesla Motors, Inc.         | US        |
[INFO] bbot.modules.asn: +----------+---------------------+--------------+----------------+----------------------------+-----------+
[INFO] bbot.modules.asn: | AS3356   | 8.32.0.0/12         | 1            | LEVEL3         | Level 3 Parent, LLC        | US        |
[INFO] bbot.modules.asn: +----------+---------------------+--------------+----------------+----------------------------+-----------+
[INFO] bbot.modules.asn: | AS3356   | 8.0.0.0/9           | 1            | LEVEL3         | Level 3 Parent, LLC        | US        |
[INFO] bbot.modules.asn: +----------+---------------------+--------------+----------------+----------------------------+-----------+

Organization IP address ranges can also be found using http://asnlookup.com/ (it has a free API). You can determine the IP and ASN of a domain using http://ipv4info.com/.
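The ranges collected this way often overlap or include blocks that belong to upstream providers rather than the target (in the BBOT output above, only the AS394161 rows are Tesla's, and 8.32.0.0/12 sits inside 8.0.0.0/9). A minimal Python sketch, using only the standard `ipaddress` module, of how the ranges kept for one ASN could be merged and sized before scanning; `summarize_ranges` is a hypothetical helper name, not part of any tool mentioned here:

```python
import ipaddress


def summarize_ranges(cidrs):
    """Merge overlapping or adjacent CIDR blocks and count the addresses they cover."""
    networks = [ipaddress.ip_network(cidr, strict=False) for cidr in cidrs]
    collapsed = list(ipaddress.collapse_addresses(networks))
    total = sum(network.num_addresses for network in collapsed)
    return [str(network) for network in collapsed], total


# Ranges copied from the AS394161 rows of the BBOT output above
tesla_ranges = ["8.244.131.0/24", "8.45.124.0/24"]
ranges, total = summarize_ranges(tesla_ranges)
print(ranges, total)
```

Knowing the total address count up front helps decide whether a full port scan of the range is realistic or whether sampling is needed.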

Looking for vulnerabilities

At this point we know all the assets in scope, so, if you are allowed to, you can run a vulnerability scanner (Nessus, OpenVAS) against all the hosts. Alternatively, you can run some port scans or use services like shodan to find open ports and, depending on what you find, look in this book for how to test the services that are running. It is also worth mentioning that you can prepare lists of default usernames and passwords and try to brute force services with https://github.com/x90skysn3k/brutespray.

Domains

We know all the in-scope companies and their assets; now it is time to find the in-scope domains.

Please note that the techniques below can also turn up subdomains, and that information should not be underestimated.

First of all, you should find the main domain(s) of each company. For example, for Tesla Inc. it is tesla.com.

Reverse DNS

Now that you have found all the IP address ranges of the domains, you can try reverse DNS lookups on those IP addresses to find more domains in scope. Try to use a DNS server of the victim or a well-known DNS server (1.1.1.1, 8.8.8.8).

dnsrecon -r <DNS Range> -n <IP_DNS>   #DNS reverse of all of the addresses
dnsrecon -d facebook.com -r 157.240.221.35/24 #Using facebooks dns
dnsrecon -r 157.240.221.35/24 -n 1.1.1.1 #Using cloudflares dns
dnsrecon -r 157.240.221.35/24 -n 8.8.8.8 #Using google dns
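To make clear what a reverse lookup over a range actually asks for, here is a small Python sketch that builds the PTR query name for every host address in a CIDR block. It only generates the names; sending the queries to a resolver is left to a tool such as dnsrecon. The helper name `ptr_names` is an assumption made for this example.

```python
import ipaddress


def ptr_names(cidr):
    """Return the in-addr.arpa name to query for each host address in the range."""
    # strict=False accepts a host address with a mask, e.g. 157.240.221.35/24
    network = ipaddress.ip_network(cidr, strict=False)
    return [address.reverse_pointer for address in network.hosts()]


names = ptr_names("157.240.221.35/24")
print(len(names), names[34])
```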

Reverse Whois (loop)

Inside a whois record you can find a lot of interesting information, such as the organization name, address, email addresses, phone numbers… But more interestingly, you can find more assets related to the company if you perform reverse whois lookups on any of those fields (for example, other whois records where the same email address appears). You can use online tools such as:

  • https://viewdns.info/reversewhois/ – free

  • https://domaineye.com/reverse-whois – free

  • https://www.reversewhois.io/ – free

  • https://www.whoxy.com/ – free web, not free API

  • http://reversewhois.domaintools.com/ – not free

  • https://drs.whoisxmlapi.com/reverse-whois-search – Not free (only 100 free searches)

  • https://www.domainiq.com/ – not free

You can automate this task with DomLink (requires a Whoxy API key). You can also do some automatic reverse whois discovery with amass:

amass intel -d tesla.com -whois

Note that you can use this technique to find more domain names each time you find a new domain.
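The loop described above can be sketched as a simple worklist in Python. The two lookup functions are injected as parameters because the real ones depend on whichever reverse whois service you use; the function names and the sample data below are hypothetical and only illustrate the recursion.

```python
from collections import deque


def expand_domains(seed, lookup_registrant_fields, reverse_whois):
    """Follow whois fields from domain to domain until nothing new appears."""
    seen_domains = {seed}
    seen_fields = set()
    queue = deque([seed])
    while queue:
        domain = queue.popleft()
        for field in lookup_registrant_fields(domain):
            if field in seen_fields:
                continue  # this email or organization name was already searched
            seen_fields.add(field)
            for related in reverse_whois(field):
                if related not in seen_domains:
                    seen_domains.add(related)
                    queue.append(related)
    return seen_domains


# Hypothetical sample data standing in for real whois and reverse whois answers
fake_whois = {
    "example.com": ["admin@example.com"],
    "example.org": ["admin@example.com", "Example Corp"],
    "example.net": ["Example Corp"],
}
fake_reverse = {
    "admin@example.com": ["example.com", "example.org"],
    "Example Corp": ["example.org", "example.net"],
}
found = expand_domains(
    "example.com",
    lambda domain: fake_whois.get(domain, []),
    lambda field: fake_reverse.get(field, []),
)
print(sorted(found))
```

Tracking the fields already searched keeps the number of paid API calls down, since the same registrant email tends to reappear on every related domain.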

Trackers

If you find the same tracker ID on 2 different pages, you can assume that both pages are managed by the same team. For example, if you see the same Google Analytics ID or the same Adsense ID on multiple pages.
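As a rough illustration of the idea, the Python sketch below pulls tracker IDs out of already downloaded HTML and keeps only the IDs seen on more than one page. The regular expressions are simplified assumptions about what Google Analytics and AdSense IDs look like, and `group_by_tracker` is a hypothetical helper name.

```python
import re
from collections import defaultdict

# Simplified patterns (assumptions): classic Analytics, GA4 and AdSense publisher IDs
TRACKER_PATTERNS = [
    re.compile(r"\bUA-\d{4,10}-\d{1,4}\b"),
    re.compile(r"\bG-[A-Z0-9]{8,12}\b"),
    re.compile(r"\bca-pub-\d{16}\b"),
]


def group_by_tracker(pages):
    """Map each tracker ID to the set of page URLs it appears on (shared IDs only)."""
    groups = defaultdict(set)
    for url, html in pages.items():
        for pattern in TRACKER_PATTERNS:
            for tracker_id in pattern.findall(html):
                groups[tracker_id].add(url)
    return {tid: urls for tid, urls in groups.items() if len(urls) > 1}


# Hypothetical page bodies; in practice these would be fetched from the targets
pages = {
    "https://a.example/": "ga('create', 'UA-12345678-1')",
    "https://b.example/": "gtag('config','UA-12345678-1'); ca-pub-1234567890123456",
    "https://c.example/": "no trackers here",
}
shared = group_by_tracker(pages)
print(shared)
```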

There are several pages and tools that allow you to search for these trackers and more:

Favicon

Did you know that we can find domains and subdomains related to our target by searching for the same hash of the favicon icon? This is exactly what the favihash.py tool created by @m4ll0k2 does. Here’s how to use it:

cat my_targets.txt | xargs -I %% bash -c 'echo "http://%%/favicon.ico"' > targets.txt
python3 favihash.py -f https://target/favicon.ico -t targets.txt -s

Simply put, favihash will allow us to discover domains that have the same favicon hash as our target.

Moreover, you can also search for technologies using the favicon hash, as explained in this blog post. This means that if you know the favicon hash of a vulnerable version of a web technology, you can search for it in shodan and find more vulnerable places:

shodan search org:"Target" http.favicon.hash:116323821 --fields ip_str,port --separator " " | awk '{print $1":"$2}'

Here’s how you can calculate the hash of a web icon:

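import codecs

import mmh3
import requests

def fav_hash(url):
    # Download the favicon; the timeout keeps a dead host from hanging the script
    response = requests.get(url, timeout=10)
    # The hash is computed over the base64 encoding of the icon, not the raw bytes
    favicon = codecs.encode(response.content, "base64")
    # 32-bit MurmurHash3 of the base64 data, the value searched with http.favicon.hash
    fhash = mmh3.hash(favicon)
    print(f"{url} : {fhash}")
    return fhash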

Copyright / unique string

Search inside the web pages for strings that could be shared across different sites of the same organization. The copyright string is a good example. Then search for that string in Google, in other search engines, or even in shodan:

shodan search http.html:"Copyright string"

CRT time

Usually there is a cron job, for example:

# /etc/crontab
37 13 */10 * * certbot renew --post-hook "systemctl reload nginx"

to renew all the domain certificates on the server. This means that even if the CA used for this does not record the time of generation in the validity period, it is possible to find domains belonging to the same company in the certificate transparency logs. For more information, see this entry.
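One way to act on this, sketched in Python under stated assumptions: given (domain, timestamp) pairs that you have already extracted from certificate transparency log entries, group the domains whose certificates were logged within a short window of each other, since a single cron run renews them back to back. The input format, the 60-second window and the name `cluster_by_issuance` are all assumptions made for this example.

```python
from datetime import datetime, timedelta


def cluster_by_issuance(certs, window_seconds=60):
    """Group domains whose certificates were logged close together in time."""
    window = timedelta(seconds=window_seconds)
    clusters = []
    for domain, logged_at in sorted(certs, key=lambda cert: cert[1]):
        if clusters and logged_at - clusters[-1]["last"] <= window:
            clusters[-1]["domains"].append(domain)
            clusters[-1]["last"] = logged_at
        else:
            clusters.append({"domains": [domain], "last": logged_at})
    # A cluster of one tells us nothing about shared infrastructure
    return [c["domains"] for c in clusters if len(c["domains"]) > 1]


# Hypothetical log entries: two certificates renewed by the same 13:37 cron run
certs = [
    ("a.example.com", datetime(2024, 3, 1, 13, 37, 5)),
    ("c.example.net", datetime(2024, 3, 5, 9, 0, 0)),
    ("b.example.com", datetime(2024, 3, 1, 13, 37, 20)),
]
print(cluster_by_issuance(certs))
```

Clusters found this way are only a hint, because unrelated certificates can be logged in the same minute, so each candidate still needs to be confirmed by other means.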

Passive takeover

Apparently, it is common for people to assign subdomains to IP addresses that belong to cloud providers, at some point lose that IP address, and forget to remove the DNS record. Therefore, just by spawning a virtual machine in a cloud (for example, DigitalOcean), you will actually be taking over some subdomains.

This post explains the story and provides a script that spawns a virtual machine in DigitalOcean, gets the IPv4 address of the new machine, and searches VirusTotal for subdomain records pointing to it.

Other ways

Note that you can use this technique to find more domain names each time you find a new domain.

Shodan

Since you already know the name of the organization that owns the IP space, you can search for it in shodan using org:"Tesla, Inc." and check the hosts found for new, unexpected domains in the TLS certificate.

You can also access the TLS certificate of the main web page, get the organization name, and then search for that name in the TLS certificates of all the web pages known to shodan with the filter ssl:"Tesla Motors", or use a tool like sslsearch.

Assetfinder

Assetfinder is a tool that looks for domains related to a main domain, and for their subdomains.

Looking for vulnerabilities

Check for domain takeovers. Maybe some company is still using a domain but has lost ownership of it. Just register it (if cheap enough) and let the company know.

If you find any domain with an IP address different from the ones you already found during asset discovery, you should perform a basic vulnerability scan (using Nessus or OpenVAS) and some port scanning with nmap/masscan/shodan. Depending on which services are running, you may find some tricks in this book to “attack” them. Note that sometimes the domain is hosted on an IP address that is not controlled by the client, so it is out of scope; be careful.

Subdomains

We know all the companies within the scope, all the assets of each company, and all the domains associated with the companies. Now is the time to find all possible subdomains of each domain found.

DNS

Let’s try to get subdomains from the DNS records. We should also attempt a zone transfer (if the server is vulnerable, you should report it).

dnsrecon -a -d tesla.com

OSINT

The fastest way to get a large number of subdomains is to search from external sources. The most used tools are the following (for best results configure API keys):

BBOT

# subdomains
bbot -t tesla.com -f subdomain-enum

# subdomains (passive only)
bbot -t tesla.com -f subdomain-enum -rf passive

# subdomains + port scan + web screenshots
bbot -t tesla.com -f subdomain-enum -m naabu gowitness -n my_scan -o .

Amass

amass enum [-active] [-ip] -d tesla.com
amass enum -d tesla.com | grep tesla.com # To just list subdomains

subfinder

# Subfinder, use -silent to only have subdomains in the output
./subfinder-linux-amd64 -d tesla.com [-silent]

findomain

# findomain, use --quiet to only have subdomains in the output
./findomain-linux -t tesla.com [--quiet]

OneForAll

python3 oneforall.py --target tesla.com [--dns False] [--req False] [--brute False] run

assetfinder

assetfinder --subs-only <domain>

Sudomy

# It requires that you create a sudomy.api file with API keys
sudomy -d tesla.com

vita

vita -d tesla.com

theHarvester

theHarvester -d tesla.com -b "anubis, baidu, bing, binaryedge, bingapi, bufferoverun, censys, certspotter, crtsh, dnsdumpster, duckduckgo, fullhunt, github-code, google, hackertarget, hunter, intelx, linkedin, linkedin_links, n45ht, omnisint, otx, pentesttools, projectdiscovery, qwant, rapiddns, rocketreach, securityTrails, spyse, sublist3r, threatcrowd, threatminer, trello, twitter, urlscan, virustotal, yahoo, zoomeye"

ProjectDiscovery publishes subdomain data for bug bounty programs through its Chaos project. You can access this data using chaospy, or browse the scope used by the project at https://github.com/projectdiscovery/chaos-public-program-list

You can find a comparison of many of these tools here: https://blog.blacklanternsecurity.com/p/subdomain-enumeration-tool-face-off

DNS Brute force

Let’s try to find new subdomains by brute forcing DNS servers with possible subdomain names. For this step, you’ll need some common subdomain wordlists.

You will also need the IP addresses of good DNS resolvers. To create a list of trusted DNS resolvers, you can download them from https://public-dns.info/nameservers-all.txt and use dnsvalidator to filter them. Or you can use: https://raw.githubusercontent.com/trickest/resolvers/main/resolvers-trusted.txt

The most recommended DNS resolution tools are:

massdns : This was the first tool to perform efficient DNS brute-force. It is very fast, but it is prone to false positives.

sed 's/$/.domain.com/' subdomains.txt > bf-subdomains.txt
./massdns -r resolvers.txt -w /tmp/results.txt bf-subdomains.txt
grep -E "tesla.com. [0-9]+ IN A .+" /tmp/results.txt
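The grep above treats dots as wildcards and would also match a name such as nottesla.com. A slightly stricter Python sketch of the same filtering step is shown below. It assumes result lines of the form `name. TTL IN A address`, which is the shape the grep pattern implies; `parse_a_records` is a hypothetical helper and the sample lines use documentation-only addresses.

```python
import re

# Assumed line shape, inferred from the grep pattern: "<name>. <ttl> IN A <ipv4>"
A_RECORD = re.compile(r"^(\S+)\.\s+\d+\s+IN\s+A\s+(\d{1,3}(?:\.\d{1,3}){3})\s*$")


def parse_a_records(lines, domain):
    """Return {subdomain: {ip, ...}} for A records that really belong to the domain."""
    suffix = "." + domain
    found = {}
    for line in lines:
        match = A_RECORD.match(line.strip())
        if not match:
            continue  # comments, CNAMEs and other record types are skipped
        name, ip = match.groups()
        if name == domain or name.endswith(suffix):
            found.setdefault(name, set()).add(ip)
    return found


sample = [
    ";; ANSWER SECTION:",
    "www.tesla.com. 300 IN A 192.0.2.10",
    "www.tesla.com. 300 IN CNAME edge.example.",
    "shop.tesla.com. 60 IN A 192.0.2.11",
    "nottesla.com. 60 IN A 192.0.2.12",
]
print(parse_a_records(sample, "tesla.com"))
```

Collecting the addresses per name also makes it easier to spot wildcard false positives, since many unrelated names resolving to one address is a typical sign of them.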

gobuster : I think this one only uses 1 resolver

gobuster dns -d mysite.com -t 50 -w subdomains.txt

shuffledns is a wrapper around massdns, written in Go, that allows you to enumerate valid subdomains using active brute force, as well as to resolve subdomains with wildcard handling and easy input/output support.

shuffledns -d example.com -list example-subdomains.txt -r resolvers.txt

puredns : It also uses massdns.

puredns bruteforce all.txt domain.com

aiodnsbrute uses asyncio to asynchronously resolve domain names.

aiodnsbrute -r resolvers -w wordlist.txt -vv -t 1024 domain.com

Second round of DNS brute force

After finding subdomains using open sources and brute forcing, you can generate alterations of the subdomains found to try to find even more. Several tools are useful for this:

dnsgen

cat subdomains.txt | dnsgen -

goaltdns

goaltdns -l subdomains.txt -w /tmp/words-permutations.txt -o /tmp/final-words-s3.txt

gotator

gotator -sub subdomains.txt -silent [-perm /tmp/words-permutations.txt]

altdns

altdns -i subdomains.txt -w /tmp/words-permutations.txt -o /tmp/asd3

dmut

cat subdomains.txt | dmut -d /tmp/words-permutations.txt -w 100 \
    --dns-errorLimit 10 --use-pb --verbose -s /tmp/resolvers-trusted.txt
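The alteration idea behind the tools above can be sketched in a few lines of Python. This is a deliberately small illustration, not a reimplementation of any of them: it only combines each word with the leftmost label of a known subdomain, and `permute_subdomains` is a hypothetical name.

```python
def permute_subdomains(subdomains, words, domain):
    """Generate candidate names by combining words with the leftmost label."""
    suffix = "." + domain
    candidates = set()
    for sub in subdomains:
        if not sub.endswith(suffix):
            continue  # ignore names outside the target domain
        labels = sub[: -len(suffix)].split(".")
        first, rest = labels[0], labels[1:]
        for word in words:
            variants = (f"{word}-{first}", f"{first}-{word}", f"{word}{first}", f"{first}{word}")
            for variant in variants:
                candidates.add(".".join([variant, *rest]) + suffix)
            # Also try the word as a new label in front of the known name
            candidates.add(".".join([word, *labels]) + suffix)
    return sorted(candidates - set(subdomains))


print(permute_subdomains(["api.tesla.com"], ["dev"], "tesla.com"))
```

The resulting list is only a set of guesses; it still has to be resolved with one of the DNS brute-force tools from the previous section.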

Generation of smart permutations

regulator: for more information read this post, but basically it extracts the main parts of the discovered subdomains and mixes them to find more subdomains.

python3 main.py adobe.com adobe adobe.rules
make_brute_list.sh adobe.rules adobe.brute
puredns resolve adobe.brute --write adobe.valid

subzuf is a subdomain brute-force fuzzer coupled with an extremely simple but effective DNS response-guided algorithm. It uses a provided set of input data, such as a tailored wordlist or historical DNS/TLS records, to accurately synthesize more corresponding domain names and expand them even further in a loop based on the information gathered during the DNS scan.

echo www | subzuf facebook.com

VHosts / virtual hosts

If you find an IP address hosting one or several web pages belonging to subdomains, you can try to find other subdomains with web pages on that IP by looking in OSINT sources for domains on the IP, or by brute forcing VHost domain names against that IP.

OSINT

You can find some virtual hosts in IP addresses using HostHunter or other APIs.

Brute force

If you suspect that some subdomain may be hidden on a web server, you can try to brute force it:

ffuf -c -w /path/to/wordlist -u http://victim.com -H "Host: FUZZ.victim.com"

gobuster vhost -u https://mysite.com -t 50 -w subdomains.txt

wfuzz -c -w /usr/share/wordlists/SecLists/Discovery/DNS/subdomains-top1million-20000.txt --hc 400,404,403 -H "Host: FUZZ.example.com" -u http://example.com -t 100

#From https://github.com/allyshka/vhostbrute
vhostbrute.py --url="example.com" --remoteip="10.1.1.15" --base="www.example.com" --vhosts="vhosts_full.list"

#https://github.com/codingo/VHostScan
VHostScan -t example.com

You can even access internal/hidden endpoints with this technique.
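Whichever tool sends the requests, the results need filtering, because most servers answer every unknown Host header with the same default page. A hedged Python sketch of that filtering step follows: it compares each response with a baseline taken from a random, nonexistent host name. The (status, length) input format, the tolerance value and the name `interesting_vhosts` are assumptions made for this example.

```python
def interesting_vhosts(baseline, responses, length_tolerance=20):
    """Keep hosts whose response differs from the default-page baseline.

    baseline: (status_code, body_length) returned for a random nonexistent Host.
    responses: {host: (status_code, body_length)} collected during brute force.
    """
    base_status, base_length = baseline
    hits = []
    for host, (status, length) in responses.items():
        same_status = status == base_status
        similar_length = abs(length - base_length) <= length_tolerance
        if not (same_status and similar_length):
            hits.append(host)
    return sorted(hits)


# Hypothetical measurements
baseline = (200, 1500)
responses = {
    "www.victim.com": (200, 1510),
    "admin.victim.com": (200, 9000),
    "dev.victim.com": (403, 1500),
}
print(interesting_vhosts(baseline, responses))
```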

CORS brute force

Sometimes you’ll find pages that only return the Access-Control-Allow-Origin header when a valid domain/subdomain is set in the Origin header. In these scenarios, you can abuse this behavior to discover new subdomains.

ffuf -w subdomains-top1million-5000.txt -u http://10.10.10.208 -H 'Origin: http://FUZZ.crossfit.htb' -mr "Access-Control-Allow-Origin" -ignore-body

Brute Force buckets

While looking for subdomains, keep an eye out for any that point to some type of bucket, and in that case check the permissions. Also, since at this point you will know all the domains in scope, try to brute force possible bucket names and check the permissions.

Monitoring

You can monitor whether new subdomains of a domain are created by monitoring the certificate transparency logs.
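Whatever source feeds the monitoring, the core of it is a comparison between the previous snapshot and the current one. A minimal Python sketch, assuming each snapshot is just a list of names pulled from certificates (wildcard entries included); the function names are hypothetical.

```python
def normalize(names):
    """Lowercase names, strip whitespace and drop a leading wildcard label."""
    cleaned = set()
    for name in names:
        name = name.strip().lower()
        if name.startswith("*."):
            name = name[2:]
        if name:
            cleaned.add(name)
    return cleaned


def new_subdomains(previous, current, domain):
    """Return names under the domain that appear in current but not in previous."""
    suffix = "." + domain.lower()
    fresh = normalize(current) - normalize(previous)
    return sorted(name for name in fresh if name.endswith(suffix))


previous = ["www.tesla.com"]
current = ["WWW.tesla.com", "*.shop.tesla.com", "tesla.com.evil.example"]
print(new_subdomains(previous, current, "tesla.com"))
```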

Looking for vulnerabilities

Check for possible subdomain takeovers. If a subdomain points to an S3 bucket, check the permissions.

If you find any subdomain with an IP address different from the ones you already found during asset discovery, you should perform a basic vulnerability scan (using Nessus or OpenVAS) and some port scanning with nmap/masscan/shodan. Depending on which services are running, you may find some tricks in this book to “attack” them. Note that sometimes a subdomain is hosted on an IP address that is not controlled by the client, so it is out of scope; be careful.

IP addresses

In the initial steps you may have found some IP address ranges, domains and subdomains. It is time to collect all the IP addresses from those ranges and for the domains/subdomains (DNS queries).

Using services from the following free APIs, you can also find previous IP addresses used by domains and subdomains. These IP addresses might still be owned by the client (and might allow you to find CloudFlare bypasses).

  • https://securitytrails.com/

You can also check for domains pointing to a specific IP address using the hakip2host tool

Looking for vulnerabilities

Port scan all the IP addresses that do not belong to CDNs (you most likely won’t find anything interesting on CDN addresses). In the running services you discover, you may be able to find vulnerabilities. Find a guide on how to scan hosts.
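Separating CDN addresses from the rest can be done with a simple membership test. The Python sketch below assumes you already have the CDN provider's published ranges; the ranges and addresses used here are documentation-only placeholders, and `split_cdn_ips` is a hypothetical helper name.

```python
import ipaddress


def split_cdn_ips(ips, cdn_ranges):
    """Split addresses into (to_scan, skipped) depending on CDN range membership."""
    networks = [ipaddress.ip_network(cidr) for cidr in cdn_ranges]
    to_scan, skipped = [], []
    for ip in ips:
        address = ipaddress.ip_address(ip)
        if any(address in network for network in networks):
            skipped.append(ip)
        else:
            to_scan.append(ip)
    return to_scan, skipped


# Placeholder values: replace with the ranges published by the CDN in use
cdn_ranges = ["203.0.113.0/24"]
ips = ["203.0.113.7", "198.51.100.20"]
print(split_cdn_ips(ips, cdn_ranges))
```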

Hunting for web servers

We’ve found all the companies and their assets, and we know the IP address ranges, domains and subdomains in scope. It’s time to look for web servers.

In the previous steps you have probably already done some reconnaissance of the discovered IP addresses and domains, so you may have already found all the possible web servers. However, if you haven’t, we are now going to see some quick tricks for finding web servers in scope.

Please note that this is focused on web application discovery, so you should also perform vulnerability and port scanning (if the scope allows it).

A quick way to discover open ports related to web servers is masscan. Other handy tools for finding web servers are httprobe, fprobe and httpx. You just pass a list of domains and they try to connect to port 80 (http) and 443 (https). You can also tell them to try other ports:

cat /tmp/domains.txt | httprobe #Test all domains inside the file for port 80 and 443
cat /tmp/domains.txt | httprobe -p http:8080 -p https:8443 #Check port 80, 443 and 8080 and 8443
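For clarity, this is roughly the list of URLs that such a probe run works through, sketched in Python. It only builds the URLs and does not send any request; `probe_urls` is a hypothetical name and the exact behavior of each tool may differ.

```python
def probe_urls(domains, extra_ports=None):
    """Build the URLs to try: default http/https plus any extra scheme:port pairs."""
    extra_ports = extra_ports or []
    urls = []
    for domain in domains:
        domain = domain.strip()
        if not domain:
            continue  # skip blank lines from the input file
        urls.append(f"http://{domain}")
        urls.append(f"https://{domain}")
        for scheme, port in extra_ports:
            urls.append(f"{scheme}://{domain}:{port}")
    return urls


print(probe_urls(["tesla.com"], [("http", 8080), ("https", 8443)]))
```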

Screenshots

Now that you’ve found all the web servers in scope (among the company’s IP addresses and all the domains and subdomains), you probably don’t know where to start. So, let’s keep it simple and start by just taking screenshots of all of them. Just by looking at the main page, you can spot odd endpoints that are more likely to be vulnerable.

To implement the proposed idea, you can use EyeWitness, HttpScreenshot, Aquatone, Shutter or webscreenshot.

What’s more, you can use eyeballer to run through all the screenshots to tell you what likely contains vulnerabilities and what doesn’t.

Public cloud resources

To find potential cloud resources owned by a company, you should start with a list of keywords that identify that company. For example, for a crypto company you might use words such as:

"crypto", "wallet", "dao", "<domain_name>", "<subdomain_names>".

You’ll also need lists of common words used in buckets:

Then, with these words, you must generate permutations (check the second round of DNS Brute-Force for more information).
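As a rough sketch of that permutation step for bucket names, the Python below combines company keywords with common bucket words using a few separators. The separator choice and the name `bucket_candidates` are assumptions; the 3 to 63 character filter reflects the length limits of S3 bucket names.

```python
from itertools import product


def bucket_candidates(keywords, common_words, separators=("", "-", ".")):
    """Combine company keywords with common bucket words in both orders."""
    names = set(keywords)
    for keyword, word, sep in product(keywords, common_words, separators):
        names.add(f"{keyword}{sep}{word}")
        names.add(f"{word}{sep}{keyword}")
    # Bucket names are lowercase and between 3 and 63 characters long
    return sorted({name.lower() for name in names if 3 <= len(name) <= 63})


print(bucket_candidates(["tesla"], ["backup"], separators=("-",)))
```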

With the resulting word lists, you can use tools like cloud_enum , CloudScraper , cloudlist , or S3Scanner .

Remember, when looking for Cloud Assets, you should be looking for more than just buckets in AWS.

Looking for vulnerabilities

If you find things such as open buckets or exposed cloud functions, you should access them and see what they offer you and whether you can abuse them.

Emails

With the domains and subdomains in scope, you basically have everything you need to start searching for emails. Here are the APIs and tools that have worked best for me to find company email addresses:

Looking for vulnerabilities

Emails will come in handy later for brute forcing web logins and authentication services (such as SSH). They are also needed for phishing. In addition, these APIs will give you more information about the person behind each email, which is useful for a phishing campaign.

Credential leak

Using domains, subdomains, and emails, you can start looking for credentials leaked in the past that belong to these emails:

Looking for vulnerabilities

If you find valid leaked credentials, it’s a very easy win.

Secrets leaks

Credential leaks relate to hacks of companies in which sensitive information was leaked and sold. However, companies may be affected by other leaks whose information is not in those databases:

Github leaks

Credentials and API keys may be leaked in the public repositories of the company or of the users working for that company on github. You can use the Leakos tool to download all the public repositories of an organization and of its developers and automatically run gitleaks on them.

Leakos can also be used to run gitleaks against the text of all the URLs passed to it, since web pages sometimes contain secrets too.
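To give an idea of what such scanners look for, here is a deliberately tiny Python sketch that searches text for two well-known secret shapes. It is a simplified stand-in, not how gitleaks or Leakos are implemented; the two patterns and the name `find_secrets` are assumptions chosen for the example, and the sample key is the placeholder used in AWS documentation.

```python
import re

# Two illustrative patterns only; real scanners ship far larger rule sets
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key_header": re.compile(r"-----BEGIN (?:RSA |EC |OPENSSH )?PRIVATE KEY-----"),
}


def find_secrets(text):
    """Return (line_number, rule_name, matched_text) for every hit in the text."""
    findings = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for name, pattern in SECRET_PATTERNS.items():
            for match in pattern.finditer(line):
                findings.append((line_number, name, match.group(0)))
    return findings


sample = 'aws_key = "AKIAIOSFODNN7EXAMPLE"\nnothing here\n-----BEGIN RSA PRIVATE KEY-----'
for finding in find_secrets(sample):
    print(finding)
```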

Paste leaks

Sometimes attackers, or just employees, publish company content on a paste site. This may or may not contain sensitive information, but it is very interesting to search for. You can use the Pastos tool to search more than 80 paste sites at once.

Google Dorks

Old but gold, Google dorks are always useful for finding exposed information that shouldn’t be there. The only problem is that the google-hacking-database contains several thousand possible queries that you cannot run manually. So, you can pick your favorite 10, or use a tool like Gorks to run them all.

Note that tools that try to run the whole database through the regular Google search will never finish, because Google will block you very, very soon.

Looking for vulnerabilities

If you find valid credentials or API token leaks, it’s a very easy win.
