This lab showcases the ability to gather, analyze, and interpret publicly available data to uncover actionable insights. Leveraging OSINT techniques, explore digital footprints across social media, websites, public records, and metadata to simulate real-world reconnaissance scenarios. The project emphasizes ethical data collection, threat profiling, and vulnerability assessment, all without breaching privacy or security.
Cybrary is a well-established, free IT training platform with several intuitive labs to explore
A paid subscription with more advanced labs is also available, but it is outside the scope of this lab
Head to https://www.cybrary.it to create a free account and access the training available on their platform
Head to OSINT to complete this training lab for yourself, or perform it on your homelab using the steps below
Requirements:
 • Windows PC w/ Internet Connection
 • USB Flash Drive w/ at least 8GB Capacity
 • Second PC with at least 2 GB of memory and 2 CPU cores
1. Create Kali Live USB
Kali Linux is an Open-Source Linux distribution which comes bundled with penetration testing tools
The OS is based on Debian and includes tools for network, social engineering and cracking attacks
You can use the operating system without having to install it onto your hard drive by using a live USB
Download Kali Linux Live ISO: Kali Live Boot Official
Download Rufus Disk Imaging Software: Rufus Official
Insert USB Flash Drive, run rufus.exe, select target drive, select Kali Live ISO, start:
Remove USB Flash Drive and insert it into the unused PC. Start the PC and press the boot menu key on startup:
Select UEFI USB Flash Boot. Scroll to select Live System with USB Persistence to load the desktop:
This live system won't save anything on your PC's hard drive, and we will complete the lab here
2. Sync Time and Update Sources
Unfortunately, Kali Live often comes with a broken APT configuration out of the box. Let's fix this
From the Kali Desktop Environment, right-click the background and left-click Open Terminal Here:
Run the following command from the Kali Live Terminal to witness the broken APT configuration:
(kali@kali)-[~/Desktop]
$ sudo apt-get update
Resulting Output:
The key piece of information is the 'Not live until 2025-10-14T23:39:53Z' message, which indicates a time error
This happens because the Kali Live system does not have its clock synchronized out of the box
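To see why the clock matters, here is a minimal Python sketch of the comparison the error message implies: a timestamp that lies in the future relative to the local clock is treated as not yet valid. The function names and the two example clock values are hypothetical and chosen only for illustration. This is not how APT itself is implemented.

```python
from datetime import datetime, timezone


def parse_release_time(stamp: str) -> datetime:
    """Parse a UTC timestamp in the form shown in the APT message."""
    parsed = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


def is_live(not_live_until: str, now: datetime) -> bool:
    """Return True once the given clock has reached the timestamp."""
    return now >= parse_release_time(not_live_until)


# Timestamp copied from the error message in this lab.
NOT_LIVE_UNTIL = "2025-10-14T23:39:53Z"

# Hypothetical clock values, for illustration only.
stale_clock = datetime(2025, 6, 1, tzinfo=timezone.utc)
synced_clock = datetime(2025, 10, 18, tzinfo=timezone.utc)

print("stale clock accepted: ", is_live(NOT_LIVE_UNTIL, stale_clock))
print("synced clock accepted:", is_live(NOT_LIVE_UNTIL, synced_clock))
```

With a clock that is behind, the check fails, which is why synchronizing the time comes before touching anything else.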
Run the following command from the Kali Live Terminal to synchronize the system's time settings:
(kali@kali)-[~/Desktop]
$ sudo timedatectl set-ntp true
Run the following command from the Kali Live Terminal to test our fix for the unsynchronized time:
(kali@kali)-[~/Desktop]
$ sudo apt-get update
Resulting Output:
Our error message was reduced in size but not eliminated completely. We now have another error
The 'Conflicting Distribution' error is caused by an improperly formatted sources.list file
Run the following command from the Kali Live Terminal to edit the contents of our sources.list:
(kali@kali)-[~/Desktop]
$ sudo nano /etc/apt/sources.list
The Kali Live system by default includes the latest snapshot in a file rather than a proper link
Edit the existing entries and add the missing lines so that sources.list reads as:
deb http://http.kali.org/kali kali-last-snapshot main contrib non-free non-free-firmware
deb-src http://http.kali.org/kali kali-last-snapshot main contrib non-free non-free-firmware
deb http://http.kali.org/kali kali-rolling main contrib non-free non-free-firmware
deb-src http://http.kali.org/kali kali-rolling main contrib non-free non-free-firmware
Save the changes in the nano text editor with CTRL+O and exit the program with CTRL+X
We now have a properly functioning APT package manager within the Kali Live system to utilize
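As a quick sanity check, the entries above can be parsed to list which distributions the file references. This is a minimal Python sketch that assumes the simple one-line format shown above, with no bracketed options. SourceEntry and parse_sources are hypothetical names introduced for illustration.

```python
from typing import List, NamedTuple, Tuple


class SourceEntry(NamedTuple):
    kind: str                    # "deb" or "deb-src"
    uri: str                     # repository URL
    suite: str                   # distribution, e.g. kali-rolling
    components: Tuple[str, ...]  # main, contrib, ...


def parse_sources(text: str) -> List[SourceEntry]:
    """Parse one-line APT source entries, skipping comments and blank lines."""
    entries = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        fields = line.split()
        if len(fields) < 3 or fields[0] not in ("deb", "deb-src"):
            continue
        entries.append(SourceEntry(fields[0], fields[1], fields[2], tuple(fields[3:])))
    return entries


# The four lines from this lab's sources.list.
SOURCES = """\
deb http://http.kali.org/kali kali-last-snapshot main contrib non-free non-free-firmware
deb-src http://http.kali.org/kali kali-last-snapshot main contrib non-free non-free-firmware
deb http://http.kali.org/kali kali-rolling main contrib non-free non-free-firmware
deb-src http://http.kali.org/kali kali-rolling main contrib non-free non-free-firmware
"""

entries = parse_sources(SOURCES)
suites = sorted({entry.suite for entry in entries})
print(suites)
```

On the live system the same function could be fed the contents of /etc/apt/sources.list instead of the inline string.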
3. OSINT Sources
Open-Source Intelligence or OSINT encompasses all of the information that is publicly accessible
This includes websites, print media, and locations, whether paid or freely available for access
Reconnaissance refers to analyzing information to create actionable insights on a specific target
OSINT is a passive form of reconnaissance whereby information is gathered without direct contact
Open-Source Intelligence comes in many different forms but generally falls into two categories:
Open-Source Intelligence can be obtained from many different types of publicly available resources:
4. Websites
For this lab our target will be a cybersecurity company called Rekt Systems with a public domain
We start by looking for surface level information to begin to build a framework around our target
Once we have some basic identifiers we can properly assess and correlate information we will find
Head to your browser and enter the given domain rekt.systems into the URL address bar:
Let's explore the page's HTML sitemap: the collection of clickable links accessible on the website
Take note of any subdomains you can find and any indication of the stack or technology types used
Upon exploring each of the navigable links from the homepage, we find the subdomain login.rekt.systems:
Let's head to the Careers page to search for information regarding their specific technology stack:
The job description for a Software Engineer role is very likely to list required experience and technologies
From here click on the "Apply Here" hyperlink located on the Careers page to search for this info:
Here we can see the full job description for the Software Engineer role, including the technology stack
This tells us what systems are used within the environment and what types of software run on them
Take note of the line listing phpMyAdmin version 4.8.1. Specific versions are useful for finding known exploits
Python is a multipurpose general programming and scripting language which includes many modules
The browser-history module can be used to export web browser history into usable CSV text files
This is very useful for reconnaissance activities involving websites, as it lets us map the HTML sitemap
Here we will walk through the process of installing this tool and running it
Run the following command from the Kali Live Terminal to update and upgrade existing packages:
(kali@kali)-[~/Desktop]
$ sudo apt-get update && sudo apt-get upgrade -y
This may take some time to fully upgrade each package, and you will be presented with prompts as well
When presented with the Dumpcap and Wireshark prompt press Enter to select the given 'NO' option:
pip is the standard package installer for Python, which streamlines the installation process for modules
Run the following command from the Kali Live Terminal to install the Python PIP package manager:
(kali@kali)-[~/Desktop]
$ sudo apt-get install python3-pip
Kali's system Python is externally managed, so modules can only be installed system-wide from the Kali Linux repo
In order for us to install the browser-history module, we must do so in an isolated virtual environment
Run the following command from the Kali Live Terminal to install pipx, which manages such environments:
(kali@kali)-[~/Desktop]
$ sudo apt-get install pipx
The pipx command can be used within the terminal to install Python command line tools in isolated environments
Run the following command from the Kali Live Terminal to install the browser-history module with pipx:
(kali@kali)-[~/Desktop]
$ pipx install browser-history
We will now use the previously installed browser-history tool to export the pages we visited as a CSV file
Run the following command from the Kali Live Terminal to capture and export the browser history:
(kali@kali)-[~/Desktop]
$ browser-history -b Firefox -o browserhistory.csv
Let's check that the operation was successful by using the ls command from the Kali Live Terminal:
(kali@kali)-[~/Desktop]
$ ls
Resulting Output:
browserhistory.csv Documents Music Public Videos
Desktop Downloads Pictures Templates
Success. Now run the following command from the Kali Live Terminal to view the contents of our file:
(kali@kali)-[~/Desktop]
$ cat browserhistory.csv
Resulting Output:
2025-10-18 01:57:55+00:00,https://rekt.systems/index.html,RektSystems | Home
2025-10-18 01:58:24+00:00,https://rekt.systems/blog.html,RektSystems | Blog
2025-10-18 01:58:29+00:00,https://rekt.systems/blog2.html,RektSystems | Blog
2025-10-18 01:58:35+00:00,https://rekt.systems/blog3.html,RektSystems | Blog
2025-10-18 01:58:41+00:00,https://rekt.systems/blog4.html,RektSystems | Blog
2025-10-18 01:58:48+00:00,https://rekt.systems/products.html,RektSystems | Products
2025-10-18 01:58:56+00:00,https://rekt.systems/services.html,RektSystems | Services
2025-10-18 01:59:01+00:00,https://rekt.systems/careers.html,RektSystems | Careers
2025-10-18 01:59:05+00:00,https://rekt.systems/software-engineer.html,RektSystems | Careers
2025-10-18 01:59:07+00:00,https://rekt.systems/support.html,RektSystems | Support
2025-10-18 02:15:00+00:00,https://login.rekt.systems/login.html,RektSystems | Login
Now we have a solid foundation for the company's website, including all accessible parts of the sitemap
For us to actually make use of this information, we need to separate the URLs from the other information
Run the following command from the Kali Live Terminal to search for URLs and send them to a new file:
(kali@kali)-[~/Desktop]
$ grep -oP 'https?://[^ ]+\.html\b' browserhistory.csv | sort -u > url.lst
This will not only extract the URLs specifically but also ensure there are no duplicate URL entries
Use the cat command from the Kali Live Terminal to view the contents of our new file:
(kali@kali)-[~/Desktop]
$ cat url.lst
Resulting Output:
https://rekt.systems/index.html
https://rekt.systems/blog.html
https://rekt.systems/blog2.html
https://rekt.systems/blog3.html
https://rekt.systems/blog4.html
https://rekt.systems/products.html
https://rekt.systems/services.html
https://rekt.systems/careers.html
https://rekt.systems/software-engineer.html
https://rekt.systems/support.html
https://login.rekt.systems/login.html
This is a success. We now have a list of pages on the target sites that we can use to plan potential attacks
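The extraction step above can also be sketched in Python. This minimal example assumes the CSV layout shown in the output (timestamp, URL, page title) and uses a hypothetical helper named extract_urls. The repeated blog.html row in the sample is added only to show the de-duplication.

```python
import csv
import io
from typing import List


def extract_urls(csv_text: str) -> List[str]:
    """Return the unique URLs from the second CSV column, sorted."""
    urls = set()
    for row in csv.reader(io.StringIO(csv_text)):
        # Skip header rows or malformed rows that carry no URL.
        if len(row) >= 2 and row[1].startswith(("http://", "https://")):
            urls.add(row[1])
    return sorted(urls)


# A few rows in the format of browserhistory.csv; the duplicate is illustrative.
SAMPLE = """\
2025-10-18 01:57:55+00:00,https://rekt.systems/index.html,RektSystems | Home
2025-10-18 01:58:24+00:00,https://rekt.systems/blog.html,RektSystems | Blog
2025-10-18 01:58:24+00:00,https://rekt.systems/blog.html,RektSystems | Blog
2025-10-18 02:15:00+00:00,https://login.rekt.systems/login.html,RektSystems | Login
"""

for url in extract_urls(SAMPLE):
    print(url)
```

Reading the column directly avoids depending on every page ending in .html, which the grep pattern requires.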
Websites often contain a combination of HTML, CSS and JavaScript code to serve content dynamically
There is a ton of useful information we can gain about the target by viewing their site's source code
We can use a command line tool called getJS to extract the JavaScript source code from the web pages
Run the following command from the Kali Live Terminal to install the Go toolchain used to build getJS:
(kali@kali)-[~/Desktop]
$ sudo apt-get install golang
Run the following command from the Kali Live Terminal to install the getJS command with go:
(kali@kali)-[~/Desktop]
$ go install github.com/003random/getJS/v2@latest
Now the getJS binary is installed onto the system in ~/go/bin/getJS, but we must copy it to a directory in $PATH
Run the following command from the Kali Live Terminal to enable the getJS command to be used in the CLI:
(kali@kali)-[~/Desktop]
$ sudo cp ~/go/bin/getJS /usr/bin
Run the following command from the Kali Live Terminal to extract all the associated JavaScript files:
(kali@kali)-[~/Desktop]
$ getJS --input url.lst --complete | xargs wget
Resulting Output:
We have received a hit on the login.rekt.systems subdomain for a JavaScript file referenced in the code
Run the following command from the Kali Live Terminal to extract any URLs within the JavaScript file:
(kali@kali)-[~/Desktop]
$ grep -Eo 'https?://[a-zA-Z0-9./?=_-]*' main.js | sort -u >> url.lst
Run the following command from the Kali Live Terminal to check our list for any new entries:
(kali@kali)-[~/Desktop]
$ cat url.lst
Resulting Output:
https://rekt.systems/index.html
https://rekt.systems/blog.html
https://rekt.systems/blog2.html
https://rekt.systems/blog3.html
https://rekt.systems/blog4.html
https://rekt.systems/products.html
https://rekt.systems/services.html
https://rekt.systems/careers.html
https://rekt.systems/software-engineer.html
https://rekt.systems/support.html
https://login.rekt.systems/login.html
https://api.rekt.systems/get/user
We have located a new 'api' subdomain for our target. Let's switch things up and view the target from another angle
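Spotting a new subdomain by eye works for a short list, but the comparison can also be sketched in Python. The example below reuses the same character class as the grep command and reports hostnames that do not appear in the known URL list. The JavaScript snippet is a made-up stand-in for main.js, and new_hosts is a hypothetical helper name.

```python
import re
from typing import Iterable, List
from urllib.parse import urlparse

# Same character class as the grep -Eo pattern used in this lab.
URL_PATTERN = re.compile(r"https?://[a-zA-Z0-9./?=_-]*")


def new_hosts(js_source: str, known_urls: Iterable[str]) -> List[str]:
    """Return hostnames referenced in the JavaScript that are not already known."""
    known = {urlparse(url).hostname for url in known_urls}
    found = {urlparse(url).hostname for url in URL_PATTERN.findall(js_source)}
    return sorted(host for host in found - known if host)


# Hypothetical file contents; only the API URL comes from the lab output.
JS_SOURCE = """
const home = "https://login.rekt.systems/login.html";
fetch("https://api.rekt.systems/get/user").then(r => r.json());
"""

KNOWN_URLS = [
    "https://rekt.systems/index.html",
    "https://login.rekt.systems/login.html",
]

print(new_hosts(JS_SOURCE, KNOWN_URLS))
```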
5. Search Engines
Search engines are one of the most powerful tools available for Open Source Intelligence Reconnaissance
They index publicly available web servers and return the pages containing the content a user searches for
However, that is not the only way they can filter results. Engines also support many search operators:
 • site:example.com - limit results to any ending in example.com
 • inurl:term - limit results to URLs containing term
 • intitle:term - restrict results to page titles containing term
 • intext:term - restrict results to any page text containing term
 • filetype:pdf - restrict results to PDF files
 • ext:pdf - restrict results to files ending in .pdf
You can use these more advanced operators to carefully tailor your search results and find information
The use of these operators in reconnaissance is known as 'Search Engine Hacking' and is very powerful
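Operators can be combined in a single query. As a small illustration, the Python sketch below assembles a query string from operator and value pairs, quoting any value that contains a space. build_query is a hypothetical helper, and the example values are only suggestions for this lab's target.

```python
def build_query(**operators: str) -> str:
    """Join operator:value pairs into one search query string."""
    parts = []
    for name, value in operators.items():
        # Multi-word values are quoted so the engine treats them as one phrase.
        if " " in value:
            value = f'"{value}"'
        parts.append(f"{name}:{value}")
    return " ".join(parts)


print(build_query(site="rekt.systems", filetype="xml"))
print(build_query(site="rekt.systems", intitle="index of"))
```

Keeping queries in a script makes it easy to rerun the same set of searches against a different domain later.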
Head back to Firefox and go to IntelTechniques, which provides useful tools for OSINT reconnaissance:
From the IntelTechniques website, navigate to Tools > Search Engines, which lets us query many engines at once
In the text box next to 'Populate All', type in 'site:rekt.systems filetype:xml' and click 'Populate All':
Click on the 'Google' button and navigate to the only result to view the website's XML sitemap file:
Look through the XML sitemap page for any links we missed in our website reconnaissance. Notice a PDF
Files are a great source of Open Source Intelligence as they may contain metadata about the systems:
Let's download this file and see what new information about the target we can gain from examining it
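Scanning a sitemap by eye is fine for a small site. For larger ones, a short Python sketch like the one below can list the entries that are not ordinary HTML pages. It assumes the standard sitemaps.org format. The sample document is hypothetical apart from the xmas-flyer.pdf link, and the function names are illustrative.

```python
import xml.etree.ElementTree as ET
from typing import List

# XML namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_links(xml_text: str) -> List[str]:
    """Return every <loc> URL listed in a sitemap document."""
    root = ET.fromstring(xml_text)
    return [loc.text.strip() for loc in root.findall("sm:url/sm:loc", SITEMAP_NS) if loc.text]


def non_html(links: List[str]) -> List[str]:
    """Keep links that point to something other than an HTML page."""
    return [url for url in links if not url.lower().endswith((".html", ".htm", "/"))]


# Hypothetical sitemap contents used for illustration.
SAMPLE_SITEMAP = """
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://rekt.systems/index.html</loc></url>
  <url><loc>https://rekt.systems/careers.html</loc></url>
  <url><loc>https://rekt.systems/xmas-flyer.pdf</loc></url>
</urlset>
""".strip()

print(non_html(sitemap_links(SAMPLE_SITEMAP)))
```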
Run the following command from the Kali Live Terminal to download our new file from rekt.systems:
(kali@kali)-[~/Desktop]
$ wget https://rekt.systems/xmas-flyer.pdf
Resulting Output:
Kali Linux comes bundled with exiftool, a command line utility which can print file metadata
Run the following command from the Kali Live Terminal to extract the metadata from our downloaded PDF:
(kali@kali)-[~/Desktop]
$ exiftool xmas-flyer.pdf
Many pieces of information are displayed in this command's output. The most useful to take note of:
 • Author: DKinney
 • Producer: Xerox AltaLink C8055
Let's do a quick Google search to look for vulnerabilities in the printing device we discovered
We can easily find CVE-2019-10881 for the device, which we could save for use in the exploitation phase
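When many files are involved, picking fields out of the exiftool output by hand becomes tedious. The Python sketch below parses the default "Tag : Value" text layout into a dictionary. The sample text contains only the two fields noted above, and parse_exiftool is a hypothetical helper name.

```python
from typing import Dict


def parse_exiftool(output: str) -> Dict[str, str]:
    """Turn 'Tag : Value' lines into a dictionary keyed by tag name."""
    fields = {}
    for line in output.splitlines():
        # Split on the first colon only, since values may contain colons.
        key, separator, value = line.partition(":")
        if separator:
            fields[key.strip()] = value.strip()
    return fields


# The two fields noted in this lab, laid out like exiftool's text output.
SAMPLE_OUTPUT = """\
Author                          : DKinney
Producer                        : Xerox AltaLink C8055
"""

metadata = parse_exiftool(SAMPLE_OUTPUT)
print(metadata["Author"], "|", metadata["Producer"])
```

The author field suggests a username format, and the producer field names the device model to research.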
6. Public Records
Another great source for intelligence, public records are accessible to anyone and can be used here
Public Records are information which must be made publicly accessible for different legal purposes
For example, the owner of a domain must be publicly recorded in case the domain is used to host illegal content
From the Firefox Web Browser head to WHOXY and search for rekt.systems to reveal ownership information:
The client's privacy settings prevent us from seeing any contact information, but we can see other info:
 • Domain Registrar
 • Country Code
 • Name Servers
Next we will take a look at the SSL/TLS certificates that have been issued for this domain by Certificate Authorities (CAs)
Certificates are used to prove the server's identity and establish an encrypted channel for data
Certificates are recorded in public Certificate Transparency Logs, including those for associated subdomains
Viewing these records provides us with a convenient method for discovering additional subdomains
Run the following commands from the Kali Live Terminal to install the ctfr command line tool:
(kali@kali)-[~/Desktop]
$ git clone https://github.com/UnaPibaGeek/ctfr.git
(kali@kali)-[~/Desktop]
$ cd ctfr
Run the following commands from the Kali Live Terminal to enumerate the subdomains with ctfr.py:
(kali@kali)-[~/ctfr]
$ python3 ctfr.py -d rekt.systems -o ~/subdomains.txt
(kali@kali)-[~/ctfr]
$ cd ~
Use the cat command from the Kali Live Terminal to view subdomains we discovered certificates for:
(kali@kali)-[~]
$ cat subdomains.txt
Resulting Output:
*.rekt.systems
rekt.systems
jira.rekt.systems
login.rekt.systems
vpn.rekt.systems
Keep in mind that these are subdomains with certificates. They may not actually resolve to anything
There could also be additional subdomains which resolve but do not carry any active certificates
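To act on that caveat, the list can be cleaned up and each name checked for a DNS record. The Python sketch below strips the wildcard prefix, removes duplicates, and defines a resolves helper built on the standard socket module. Both function names are hypothetical, and a DNS lookup is a query to a resolver rather than to the target's servers, though it is still a step beyond reading a public log.

```python
import socket
from typing import Iterable, List


def candidate_hosts(lines: Iterable[str]) -> List[str]:
    """Normalize certificate names: drop the '*.' prefix, blanks, and duplicates."""
    hosts = set()
    for line in lines:
        name = line.strip().lower()
        if name.startswith("*."):
            name = name[2:]
        if name:
            hosts.add(name)
    return sorted(hosts)


def resolves(host: str) -> bool:
    """Return True if the hostname currently has a DNS record (network lookup)."""
    try:
        socket.getaddrinfo(host, None)
        return True
    except socket.gaierror:
        return False


# Names from the subdomains.txt output in this lab.
CT_NAMES = [
    "*.rekt.systems",
    "rekt.systems",
    "jira.rekt.systems",
    "login.rekt.systems",
    "vpn.rekt.systems",
]

hosts = candidate_hosts(CT_NAMES)
print(hosts)
```

Calling resolves(host) for each entry in hosts would separate the names that currently resolve from those that only exist in the certificate log.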
7. CTI Platforms & Web Archives
Web Archives are historical databases documenting the internet and websites as they were over time
These can be used to see previously associated pages for our target even if they no longer exist
We will be using the command line tool GAU (Get All URLs) to probe multiple Web Archives for info
Run the following command from the Kali Live Terminal to install GAU with the go package manager:
(kali@kali)-[~/Desktop]
$ go install github.com/lc/gau/v2/cmd/gau@latest
Now the gau binary is installed onto the system in ~/go/bin/gau, but we must copy it to a directory in $PATH
Run the following command from the Kali Live Terminal to enable the gau command to be used in the CLI:
(kali@kali)-[~/Desktop]
$ sudo cp ~/go/bin/gau /usr/bin
Run the following command from the Kali Live Terminal to probe multiple Web Archives for URLs:
(kali@kali)-[~/Desktop]
$ gau rekt.systems
Resulting Output:
Bingo, now we have a ton of URLs to work from. Take note of the URL containing an invite_token session key
We can attempt to use this session key to hijack the session and impersonate a legitimate user
From the Kali Live Terminal, right-click the link with invite_token and choose the Open Link option:
Congratulations, you have successfully performed a session hijacking attack during reconnaissance
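Archive results can run to hundreds of lines, so it helps to flag URLs whose query strings carry token-like parameters. The Python sketch below does this with the standard urllib.parse module. The parameter list, the sample URLs, and the token value are all hypothetical, since the real archived URL is not reproduced here.

```python
from typing import Iterable, List, Tuple
from urllib.parse import parse_qs, urlparse

# Hypothetical list of parameter names worth a closer look.
SENSITIVE_PARAMS = {"invite_token", "token", "session", "api_key"}


def flag_sensitive(urls: Iterable[str]) -> List[Tuple[str, List[str]]]:
    """Return (url, matching parameter names) for URLs with sensitive parameters."""
    hits = []
    for url in urls:
        params = parse_qs(urlparse(url).query)
        found = sorted(SENSITIVE_PARAMS.intersection(params))
        if found:
            hits.append((url, found))
    return hits


# Made-up archive results; the path and token value are placeholders.
SAMPLE_URLS = [
    "https://rekt.systems/blog.html",
    "https://login.rekt.systems/login.html?invite_token=EXAMPLE",
]

for url, names in flag_sensitive(SAMPLE_URLS):
    print(url, names)
```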
8. Social Platforms
Social Media Platforms can be used to gather open source intelligence on the people associated with our target
The term 'Lurking' is a good summary: viewing and searching for people's profiles without interacting
LinkedIn is a social platform used for work and professional related networking and public sharing
Head to Firefox and type in "site:linkedin.com intext:rektsystems" to search LinkedIn for our target:
Doug Kinney seems very similar to the DKinney author listed for the xmas PDF file discovered before
Click on the profile link for the company's CEO and likely PDF file author to discover his email:
Great job. Without leaving a single trace on any systems, we've discovered the CEO's email address
This concludes our lab on Open Source Intelligence Reconnaissance. You now have the necessary tools
Check back for future labs where we go beyond this and enter the exploitation and documentation phases