Dark Web Monitoring Tool

Client:

Axis (during my work period there)


Duration:

3 Months


Threat intelligence

SIEM

Python Script


Nowadays, every company should know where its data is, who can access it, and whether anyone is making off with it without the company's knowledge.


The dark web is a favorite place for APTs and hacker groups to proudly announce their attacks.


Cybersecurity companies took advantage of this behavior and built tools that monitor the dark web for an organization's data and information, to detect whether it has been breached.



This tool is called Dark Web Monitoring.



If the tool detects a breach, the company has time to contain the incident early and minimize the losses. This matters because no system is completely secure, and no solution can guarantee that it will prevent every cyber attack.


During my tenure at AXIS, a fintech company, we saw the urgency of having this tool to detect any breach that could hit our customer data.


Instead of buying a dark web monitoring product, I was assigned to build one from scratch!


In this technical blog, I will walk through how we created this tool using Python and the ELK SIEM, by God's favor and guidance, with the help of my great team and the support of leadership.


Architecture Diagram

Figure 1: Dark Web Monitoring Architecture Diagram


The logic and sequence of the tool are as follows:


1. Create a Python script that takes the specific keywords you want to monitor (e-mails, phone numbers, company domains, etc.).


2. The Python script takes these keywords and searches for them in three main sources:

  • Dark web search engines

  • APT forums

  • APT Telegram groups


3. The Python script takes the output of these sources and saves it to log or JSON files.


4. Filebeat ships these files to Elasticsearch to be indexed and parsed, and the data is then monitored in Kibana through a beautiful dashboard created by me 😎

Figure 2: Final Dashboard


(Note: This is not real data. It is just for testing purposes.)


Easy peasy, right? Now let's go deeper.

Technical Details

First, I imported these libraries; each one has a specific role in our script, as we see in Figure 3.

Figure 3: Libraries
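
The exact imports are in the screenshot above, so the list below is only my reading of what the rest of the post relies on: requests and BeautifulSoup for scraping through Tor, and the standard library for logging, file handling, and timestamps.

```python
# Assumed import list (the authoritative one is in Figure 3):
import os                      # create the log folders
import json                    # serialize findings as JSON lines
import logging                 # write the daily log files
from datetime import datetime  # date-based file names and timestamps

import requests                # HTTP requests through the Tor SOCKS proxy
from bs4 import BeautifulSoup  # parse the search engines' HTML
```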


Figure 4: The first lines of code

  • The first part creates the folders so we can save the logs in an organized way, as shown in Figure 5 below.

Figure 5: Organized folders for saving the logs

  • The second part of Figure 4 names every log file after the day it was created, as shown below.

    Figure 6: The log file names

  • The last part of Figure 4 uses the logger variable to configure the log format and how the logs are saved. A minimal sketch of this whole setup is shown below.
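
To make the walkthrough concrete, here is a sketch of what Figure 4 does, assuming one folder per keyword type and one file per day; the folder names and the plain-message log format are my assumptions, not the exact code from the screenshot.

```python
import os
import logging
from datetime import datetime

# Part 1: create the folders that hold the logs (layout assumed from Figure 5).
log_dirs = ["logs/emails", "logs/phones", "logs/domains"]
for folder in log_dirs:
    os.makedirs(folder, exist_ok=True)

# Part 2: name each log file after the day it was created, e.g. logs/emails/2024-01-15.json
today = datetime.now().strftime("%Y-%m-%d")
log_path = os.path.join("logs/emails", f"{today}.json")

# Part 3: the logger controls the format and how logs are saved --
# one bare JSON document per line, so Filebeat can ship the file as-is.
logger = logging.getLogger("darkweb_monitor")
logger.setLevel(logging.INFO)
handler = logging.FileHandler(log_path)
handler.setFormatter(logging.Formatter("%(message)s"))
logger.addHandler(handler)
```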


Let's move to the next lines:

Figure 7: Request Code

  • The tor_proxy variable and the proxies are responsible for routing our requests through the Tor network, as we see in Figure 7.


You may be asking: what request?

Well, the proxies are used by the other functions that are responsible for scraping data from the dark web over Tor; a minimal sketch of this setup appears after the note below.


  • From line 28 to line 30, we have three different lists that store the keywords we want to monitor.


Note: If you want to monitor data for a large company with many employees, you will need to use a database and integrate it with the script instead of hard-coded lists.
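
Here is a minimal sketch of the proxy setup and the keyword lists, assuming Tor is running locally and exposing its default SOCKS port (9050); the variable names follow the post, but the values are placeholders.

```python
# Route requests through the local Tor SOCKS proxy (default port 9050).
# Requires a running Tor service and `pip install requests[socks]`.
tor_proxy = "socks5h://127.0.0.1:9050"
proxies = {
    "http": tor_proxy,
    "https": tor_proxy,
}

# The three keyword lists (placeholder values only).
emails = ["soc@example.com", "ceo@example.com"]
phones = ["+201000000000"]
domains = ["example.com"]
```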


The next step was to decide which dark web search engines to query.

I also examined the HTML structure of each search engine's results page to see exactly where the results appear and in which CSS class.


Some examples of sites I worked on:

[Site 1] - Ahmia

Figure 8: Ahmia Search Engine


[Site 2] - Kraken

Figure 9: Kraken Search Engine


And many more…


Now let's search for any keyword and check the page source to find exactly where the URL and description are rendered, so we can parse them in our script. These two fields (URL and description) are the most important pieces of information for determining whether there has been a data breach.

Figure 10: View page source

Figure 11: The place of the URL and description


Now let's write a function that parses the URL and description from each site with the help of the BeautifulSoup library.

Note that I added a separate parsing function for every search engine, because the HTML structure differs from site to site.

Figure 12: parse_search_engine_1 function for Ahmia


Figure 13: parse_search_engine_2 function for Kraken
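
As an illustration of what such a function looks like, here is a hedged sketch of an Ahmia-style parser; the query URL and the CSS selectors are assumptions about the engine's markup and have to be checked against the page source for each site, which is exactly why every engine gets its own function.

```python
def parse_search_engine_1(keyword):
    """Search Ahmia for a keyword and return the URL and description of each result.

    The selectors below are assumptions about Ahmia's result page and must be
    verified against the page source (Figures 10 and 11) before relying on them.
    """
    response = requests.get(
        f"https://ahmia.fi/search/?q={keyword}",
        proxies=proxies,
        timeout=60,
    )
    soup = BeautifulSoup(response.text, "html.parser")

    findings = []
    for result in soup.select("li.result"):      # one <li> per search hit (assumed)
        link = result.find("cite")               # the .onion URL (assumed)
        snippet = result.find("p")               # the result description (assumed)
        if link is not None:
            findings.append({
                "url": link.get_text(strip=True),
                "description": snippet.get_text(strip=True) if snippet else "",
            })
    return findings
```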


To structure the data and send it to the ELK SIEM, we create these metadata fields:

  • Entity_Type: is it an e-mail, a phone number, or a domain?

  • Url

  • Description

  • Search_engine_name

The purpose of this metadata is to let us customize our Kibana dashboard and display exactly the information we want; a minimal sketch of how one finding is serialized follows Figure 14.

Figure 14: Parsing Metadata
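
Here is a sketch of how one finding could be serialized with these fields; the field names follow the list above, while the helper name, the extra Keyword and @timestamp fields, and the one-JSON-document-per-line format are my assumptions.

```python
def log_finding(entity_type, keyword, finding, search_engine_name):
    """Build the metadata record for one hit and write it as a single JSON line."""
    record = {
        "Entity_Type": entity_type,                  # e-mail, phone, or domain
        "Keyword": keyword,                          # the monitored value that matched (assumed field)
        "Url": finding["url"],                       # where the hit was found
        "Description": finding["description"],       # snippet returned by the engine
        "Search_engine_name": search_engine_name,    # e.g. "Ahmia" or "Kraken"
        "@timestamp": datetime.now().isoformat(),    # assumed field for Kibana's time filter
    }
    logger.info(json.dumps(record))                  # Filebeat ships this line to Elasticsearch
```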


Finally, this is the main function of the script: it passes each dark web search engine URL to its parsing function.

The loop continues until every keyword from the lists in Figure 7 has been searched. A minimal sketch of this loop follows Figure 15.

Figure 15: Main Function
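
Here is a sketch of such a main loop under the same assumptions as the snippets above (the keyword lists, the two parse_search_engine_* functions, and the log_finding helper); the real function in Figure 15 may differ in structure.

```python
def main():
    # Keyword lists grouped by entity type.
    targets = {
        "email": emails,
        "phone": phones,
        "domain": domains,
    }

    # One parsing function per search engine; add a pair here for every engine you support.
    engines = {
        "Ahmia": parse_search_engine_1,
        "Kraken": parse_search_engine_2,
    }

    # Loop until every keyword has been searched on every engine.
    for entity_type, keywords in targets.items():
        for keyword in keywords:
            for engine_name, parse in engines.items():
                for finding in parse(keyword):
                    log_finding(entity_type, keyword, finding, engine_name)


if __name__ == "__main__":
    main()
```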


The final dashboard is built from the .json files that Filebeat ships from the folders shown in Figure 5.


Telegram Groups

Next time, I will show the Python script that monitors APT Telegram groups in the MENA region.

Figure 16: APT Telegram Groups


Dashboard Overview



Do you have any questions?

- Contact me on LinkedIn




Let's Talk


© 2023. All rights reserved.
