No. 7. Practical Hacking and Social Engineering (Detection, Measurement and Reporting)

3 July 2023 29 minutes Author: Cyber Witcher

Detection, Measurement and Reporting in Social Engineering

Detection, measurement and reporting in social engineering are an integral part of security strategies and protection against fraud and social attacks. This process involves systematically identifying potential threats, assessing their risk, measuring the effectiveness of security measures, and generating reports for analysis and subsequent strategic decisions. Social engineering detection involves detecting suspicious activity, which may include phishing attempts, attempts to gain illegal access to systems, or attempts to manipulate users to obtain sensitive information. This may include monitoring social media, analyzing user behavior, using threat detection techniques, and other methods.

Measurement in social engineering involves assessing risk, evaluating the effectiveness of security measures, and identifying potential vulnerabilities. The most valuable part of the social engineering process is also often underestimated or ignored. In this section, you’ll learn about the detection, measurement, and reporting steps. Often, part of helping a client organization is teaching it to detect attacks on its own. In the measurement phase, you should collect statistics on your successes, as well as other key performance indicators. You will use these statistics to create a professional, easy-to-understand report. We will discuss different ways to measure the results of an attack, including metrics that make your data understandable to your clients. The most useful of these metrics go beyond simply counting how many emails were opened or links clicked. Finally, I will explain how to write a useful, interesting report that the managers of the client organization can understand.


Detection

While all pentesters, and social engineers in particular, enjoy gaining unauthorized access, ethical pentesters must also expect to be detected. Clients don’t pay you to destroy their company’s infrastructure and humiliate their employees. Rather, they need an understanding of their company’s weaknesses and advice on how to overcome them.

So you have to strike a middle ground: challenge your targets, but do it fairly. The quality of training of the client’s personnel is beyond your control. The third chapter of the book is devoted to this. The part that depends on you is the structure of your engagements. By using attacks of various levels of complexity, you can give employees the opportunity to detect you and notify security.

When determining the scope of the engagement, you need to reach a clear understanding with the client about how covert you should be. If you’re conducting the attack to inform decisions about a staff training program, you’ll probably want to be either very stealthy or deliberately loud, depending on the maturity of the organization you’re working with (although the final decision is up to the client). A mature organization may be able to handle a covert operation but still insist on an overt one, or it may be too immature to benefit from a covert operation, even if management prefers that option.

If you operate covertly, you can get an accurate picture of what an experienced adversary inside the organization is capable of. Covert operations can be a good guide to understanding risk if there is a strong rationale for them. Organizations that already have awareness and training programs in place benefit the most. Such operations are often best suited to adversarial emulation, such as a red team engagement or, to a lesser extent, a penetration test. An organization may also run such exercises on its own.

Overt tests are a great first point of contact for an organization completely new to adversarial simulation and social engineering. If the purpose of the adversarial simulation is to test the company’s procedures, overt operations can be the best option and will save you and the client a significant amount of time.


Measurement

You need to use metrics to measure the success of your engagement. But which indicators are important? How do you measure them? Do you need to take a course in statistics or get a degree in data science?

Indeed, some knowledge of statistics can be useful, and in certain situations – for example, if you want to evaluate which of the company’s departments most often fell victim to social engineering attacks or which schemes and terms turned out to be the most effective – an understanding of data science concepts such as regression and cluster analysis definitely won’t hurt. However, in most cases these skills are not needed at all. Also keep in mind that if you plan to do statistical research, you’ll need a significant data set (thousands of phishing emails, if not millions) and, more importantly, customer consent.

Selection of indicators

When choosing metrics, try to be as practical as possible. What could put the client on the front page of the local newspaper? What could lead to legal trouble? In most cases, simply opening an email has no negative consequences, so that metric may not be very useful to measure. Clicking links, disclosing confidential information, or failing to report an attack does have negative consequences.

While you may have your own ideas about which metrics to consider, knowing which metrics the client considers important is also critical. Based on this, you can organize the data in a way that helps the customer understand it.

Ratios, medians, means, and standard deviations

To present data meaningfully, you need to know how to calculate the following values: ratios, medians, means, and standard deviations. As a rule of thumb, you need at least 30 data points for these statistics to be meaningful.

Ratios, in simple terms, tell you how much of Y is X, usually as a percentage. For example, if you send a phishing email to 100 employees of a company and 19 of them click on the link, the ratio of clicks to recipients is 19 out of 100, or 19% (this ratio is also called the click rate).

The median is the middle data point in a set. If you arrange the data points in order from smallest to largest, the median sits the same number of positions from both ends. For example, suppose you performed three phishing engagements against a client with 100 users. 62 users fell victim in the first engagement, 34 in the second, and 19 in the third. Put these data points in order, from smallest to largest. If you know how to program, think of the data as an ordered array [19, 34, 62]. The median is 34 because it is the value in the middle.

The mean, on the other hand, is the arithmetic average of all the data points. Using the same three engagements, add up the three values and divide the result by 3 (since there are three values): 115 / 3 gives a mean of about 38.33.
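The ratio, median, and mean from the examples above can be reproduced in a few lines. This is a minimal sketch using Python's standard `statistics` module; the variable names are illustrative and not taken from the book.

```python
from statistics import mean, median

# Victims per engagement, from the example above (100 users targeted each time).
victims_per_engagement = [62, 34, 19]
users_targeted = 100

# Click rate for the single campaign in which 19 of 100 recipients clicked.
click_rate = 19 / users_targeted

print(f"Click rate: {click_rate:.0%}")                  # Click rate: 19%
print(f"Median: {median(victims_per_engagement)}")      # Median: 34
print(f"Mean: {mean(victims_per_engagement):.2f}")      # Mean: 38.33
```

Note that `median` sorts the data itself, so the list does not need to be ordered first.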

To describe a set of values as a whole, we use the standard deviation, a measure of how spread out the values are. Simply put, the standard deviation tells you how far values typically fall from the mean. I could bore you with the actual equation, but fortunately Excel and most spreadsheet programs will calculate it for you. A low standard deviation means that the data points in a set are more alike than those in a set with a high standard deviation. These figures will help the client understand the overall health of the organization in terms of the behavior of its employees.

For example, given the data set 1, 1, 1, 1, 5, 7, 24, the following values can be generated:

  • Median: 1

  • Mean: 5.714

  • Standard deviation: 8.420

From these statistics, you can draw a general conclusion for the client about how its employees coped with the test. For example, the standard deviation of 8.420 shows that the numbers in the set vary widely, which probably means that people behaved quite differently. This standard deviation makes more sense if we go back to the original data and consider the large gap between the largest number, 24, and the next largest, 7. Changing 24 to 12 reduces the variation in the data set and cuts the standard deviation to about 4.282.
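The figures for this data set can be checked with a short script. The sketch below assumes the sample standard deviation (dividing by n − 1), which is what `statistics.stdev` computes and what reproduces the value of about 8.42; the population formula (`statistics.pstdev`) would give a smaller number.

```python
from statistics import mean, median, stdev

original = [1, 1, 1, 1, 5, 7, 24]
adjusted = [1, 1, 1, 1, 5, 7, 12]  # the outlier 24 replaced with 12

for label, data in (("original", original), ("adjusted", adjusted)):
    # stdev() is the sample standard deviation (n - 1 in the denominator).
    print(
        f"{label}: median={median(data)}, "
        f"mean={mean(data):.3f}, stdev={stdev(data):.3f}"
    )
```

Whichever formula you choose, state it in the report so that the client can reproduce your numbers.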

The number of email opens

Email opens are a relatively minor metric. Combined with clicks and data entry on phishing sites, they can yield more useful information, such as a ratio that shows how quickly non-technical personnel can identify phishing emails from the subject line. However, clients often focus on the number of opens. I always try to argue against that.

Emails are intended to be opened, and while some may contain malware, applying this metric may encourage users to avoid legitimate emails. The question of whether an e-mail contains malware is decided by the mail administration and security teams, who are responsible for monitoring incoming e-mail messages for attachments. While users must do their part to keep the organization secure, the company’s accountant is not an expert in information security or malware. If phishing attacks use realistic subject lines and background information, how can users tell that an email is phishing without opening it?

There is an important caveat: more advanced attackers can use a browser-side exploit to collect metadata from the browser when a user opens an email. These are real threats, but if a company can’t counter basic phishing, it likely lacks the skills to mitigate more sophisticated attacks. This illustrates a topic we will cover in Chapter 10, Defense in Depth. Do not rely solely on training or solely on technical controls. Train staff, but also implement email security tools in addition to anti-malware.

Instead of focusing solely on the number of email opens, focus on the following useful metrics:

  • Time to first report. The time of the first report to security minus the time the email was first opened.

  • Open reporting ratio. The number of reports to security divided by the number of emails opened.

Time to first report measures the interval between when the first email was opened and when the security team became aware of the campaign. This metric is important because it indicates how much time the security team has to head off serious phishing attacks. After receiving a report of a suspicious email, the security team can review the email, investigate the associated website in an isolated environment, and then begin taking defensive actions. These actions may include getting the site taken down and sinkholing the link (changing internal DNS configurations so that users who click it are redirected to a safe place). A long time to first report may indicate that many people opened the email before the security team knew about it, which leaves the team less time to act before someone clicks the link without reporting it.

The open reporting ratio (reports divided by opens) measures user engagement from a reporting perspective. How many of those who opened the email reported it to security? This metric shows whether a cooperative relationship exists between users and the security team. It also speaks to users’ ability to recognize a phishing attempt.
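Both open-related metrics reduce to simple arithmetic on timestamps and counts. The following is a minimal sketch; the function names and the sample data are hypothetical, and it assumes the ratio is reports divided by opens, as the paragraph above describes.

```python
from datetime import datetime, timedelta


def time_to_first_report(first_open: datetime, first_report: datetime) -> timedelta:
    """Time of the first report to security minus the time of the first open."""
    return first_report - first_open


def open_reporting_ratio(reports: int, opens: int) -> float:
    """Reports to security divided by opened emails (0.0 if nothing was opened)."""
    return reports / opens if opens else 0.0


# Hypothetical campaign data, for illustration only.
first_open = datetime(2023, 7, 3, 9, 5)
first_report = datetime(2023, 7, 3, 9, 47)

print(time_to_first_report(first_open, first_report))        # 0:42:00
print(f"{open_reporting_ratio(reports=12, opens=80):.0%}")   # 15%
```

In practice the timestamps would come from the mail platform's tracking data and the security team's ticketing system, so make sure both use the same time zone before subtracting.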

Number of clicks

The number of clicks on a link in an email is one of the most important metrics. Clicking on the link may lead to malware infection, file downloads, or leakage of sensitive information such as passwords through fake forms. However, while this metric is more important than the number of opens, it is not the most important.

It is much more valuable to combine the number of clicks with other data, such as the time from the first click to the first report to security, or the click reporting and input ratios. The point is that it’s important to understand how well the organization can respond to clicks, not just whether users will follow the link.

While it’s true that users shouldn’t click on these links in the first place, I’ll reiterate that it’s the security team’s responsibility to protect the company if they do. Mail administrators, information security teams, and users must work together to ensure that when a link is clicked, the system protects the user. The burden of protecting the organization should not fall entirely on an untrained user who does not need to know all the technical details.

Useful metrics related to clicks include the following:

  • Time from click to report. The time of the first report to security minus the time of the first click.

  • Click reporting ratio. The number of reports to security divided by the number of clicks.

  • Input ratio. The number of times information was entered (for example, in a form) divided by the number of clicks.

  • Input reporting ratio. The number of times information was entered divided by the number of reports to security.

Like time to first report for opens, the time from click to report measures the interval between the first click on the link and the first report to security. This figure again reflects the time the security team has to mitigate the effects of phishing. Likewise, the click reporting ratio, like the open reporting ratio, measures user engagement from a reporting perspective.

Note the difference between click reporting and open reporting. Ideally, a higher open reporting ratio is more beneficial to the organization, because it makes the security team aware of the attack earlier and leaves more room for action. In the best case, an organization has a high reporting ratio for phishing email opens and few clicks against which to evaluate click reports. A click reporting ratio that is higher than or equal to the open reporting ratio indicates that people are opening emails and then clicking on links before reporting the suspicious email to security.

The input ratio is the share of people who clicked on a link in the email and then entered information on the associated website. Similarly, the input reporting ratio compares the number of information inputs to the number of reports to security. In an ideal world, the input ratio would be zero (zero divided by any nonzero number is zero). This would mean that no one entered any information despite opening the email and clicking on the link.

Failing that, we want inputs to be reported at a high rate. That indicates that people notice when they make mistakes and are comfortable admitting them to the security team. It is better for a user to report the link without fear of punishment than to remain silent while malicious activity occurs. It’s easier for security professionals to defend against actions they know about than to be overwhelmed by sudden consequences they could have prevented.
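The click-related ratios listed above can be computed from three counts. This is a sketch only: the `ClickCounts` class and its field names are hypothetical, and `reports` is assumed to be the total number of reports the security team received for the campaign.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ClickCounts:
    clicks: int   # recipients who followed the link
    inputs: int   # times information was entered on the phishing site
    reports: int  # reports made to the security team (assumed campaign total)


def ratio(numerator: int, denominator: int) -> Optional[float]:
    """Return numerator / denominator, or None when the denominator is zero."""
    return numerator / denominator if denominator else None


def click_metrics(c: ClickCounts) -> dict:
    """Compute the click-related ratios exactly as defined in the list above."""
    return {
        "click_reporting_ratio": ratio(c.reports, c.clicks),
        "input_ratio": ratio(c.inputs, c.clicks),
        "input_reporting_ratio": ratio(c.inputs, c.reports),
    }


# Hypothetical counts, for illustration only.
metrics = click_metrics(ClickCounts(clicks=20, inputs=5, reports=10))
print(metrics)
```

Returning `None` instead of raising on a zero denominator keeps a campaign with no clicks or no reports from breaking the report generation; how to present such a case to the client is a judgment call.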

Entering information into forms

The nature of the information that users enter into forms is another important indicator. Users may enter passwords, email addresses, and other sensitive information in these forms, and without a reliable security operations center that actively monitors users’ systems and their activity on the local network and the Internet, an organization cannot feel safe. When reporting on this metric, do not include the actual passwords or data collected. If you need to share information, share only a list of users who need to change their passwords, and it’s best to do so outside of the formal report.

Useful indicators related to information include the following:

  • Input ratio. The number of information inputs divided by the number of clicks.

  • Input reporting ratio. The number of information inputs divided by the number of reports to security.

  • Validity ratio. The number of valid credentials entered divided by the total number of credentials entered.

  • Compromised ratio. The number of users who entered information and also appear in the Have I Been Pwned database divided by the total number of users who entered information.

To calculate the validity ratio (the share of entered credentials that are valid), you need to know the hashes of the users’ real credentials. If the client’s security team is willing to provide you with hashes of their users’ passwords, you can hash the information entered in the form using the same hashing algorithm and then compare the two hashes to check whether users entered valid information. This shows you how many people entered valid information on the phishing site compared with how many entered false information, whether accidentally or on purpose, out of a desire to troll.
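The comparison described above might look like the following sketch. It assumes, purely for illustration, that the client supplies unsalted SHA-256 hex digests keyed by username; real environments typically use salted, deliberately slow schemes, and you would have to apply exactly the algorithm and parameters the client uses. All names and sample values are hypothetical.

```python
import hashlib
import hmac


def sha256_hex(password: str) -> str:
    """Hash a password with unsalted SHA-256 (an assumption for this sketch)."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


def validity_ratio(submitted: dict[str, str], known_hashes: dict[str, str]) -> float:
    """Valid credentials entered divided by all credentials entered."""
    if not submitted:
        return 0.0
    valid = sum(
        1
        for user, password in submitted.items()
        # compare_digest avoids leaking information through comparison timing
        if user in known_hashes
        and hmac.compare_digest(sha256_hex(password), known_hashes[user])
    )
    return valid / len(submitted)


# Hypothetical data: hashes from the client, and what users typed into the form.
known = {"alice": sha256_hex("Spring2023!"), "bob": sha256_hex("hunter2")}
entered = {"alice": "Spring2023!", "bob": "not-my-real-password"}

print(validity_ratio(entered, known))  # 0.5
```

Only the resulting ratio belongs in the report; the submitted passwords themselves should be handled and destroyed according to the rules of engagement.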

If the organization phishes its employees too often or links test results to performance evaluations, employees sometimes intentionally enter false information or even other employees’ data. While we shouldn’t encourage employees to enter anything at all, training them to enter false information can benefit the organization in two ways. If the organization has established a standard set of false information (email address, name, phone number, password), the security team can check whether a site accepts the false information and can monitor for its use, which, in the event of a leak, helps identify unauthorized access attempts. Without validation, however, false entries can distort the statistical analysis and results of your report, giving the impression that the organization is performing worse than it actually is.

A little OSINT is required to calculate the compromised ratio (the share of users who entered information and who also appear in the Have I Been Pwned database). This metric is useful because it reflects the impact of social engineering on user behavior beyond the attack you are running. Using the victims’ work email addresses, see how many are listed in the Have I Been Pwned database (introduced in Chapter 6), and compare the result with the total number of users who entered information. If you find users in the database, then depending on the breach involved, you may want to recommend adding a clause on acceptable use of company email to the employee guidelines and establishing a firm security policy to prevent similar incidents in the future. Finding employees in the database indicates that they may behave carelessly online and pose a risk to the organization.

This metric is likely to be biased. Some people use their work email address on all sorts of sites and never end up in a breach. In addition, people often use the same passwords at home and at work, so if you don’t measure all the breaches and password thefts a user has experienced, the numbers will be incomplete. An employer may not investigate employees’ personal lives, which includes personal passwords, without their consent. I’m always wary of asking for this kind of consent, but if you had an employee’s consent to search breach databases for their personal accounts, you could remove much of the bias in this metric, because you would be able to assess each employee’s overall security posture.
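Once the lookups are done, the ratio of users who entered information and also appear in a breach database is a set intersection. The sketch below assumes you already hold the list of breached addresses from your own authorized lookups; it makes no calls to any service, and the function name and addresses are hypothetical.

```python
def compromised_ratio(entered_info: set[str], breached: set[str]) -> float:
    """Users who entered information AND appear in breach data, over all who entered."""
    # Normalize case so that Alice@example.com and alice@example.com match.
    entered = {address.lower() for address in entered_info}
    known_breached = {address.lower() for address in breached}
    if not entered:
        return 0.0
    return len(entered & known_breached) / len(entered)


# Hypothetical addresses, for illustration only.
entered_info = {"alice@example.com", "Bob@example.com", "carol@example.com", "dan@example.com"}
breached = {"bob@example.com", "erin@example.com"}

print(f"{compromised_ratio(entered_info, breached):.0%}")  # 25%
```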

Actions of the victim

Actions taken by the victim may include opening the email, deleting the email, forwarding it to non-technical or non-security personnel, forwarding it to security, clicking on links in the email, entering information, or reporting it (whether or not the employee fell victim). It is very important to understand what users do after they fall victim. Do they report it to management, try to hide it, or do nothing? Including this information in the report will require input from the client, but obtaining it is usually not difficult. In my experience, I got what I needed simply by asking for it.

Detection time

How long does it take for an organization’s security team to detect a phishing attempt? Was the organization alerted by user reports, by an application or email service, or by its SIEM system? The time it takes to detect an event speaks volumes about the maturity of the organization and its information security capabilities. How long detection takes, and whether it happens at all, indicates how catastrophic an attack could be.

Depending on whose report you read and the motivations of its author, dwell time (the amount of time an attacker can operate in an environment without triggering detection or other countermeasures) varies from days to years. A shorter dwell time means an attacker has less time to gain a foothold, directly harm the organization, or damage its reputation.

We have discussed useful metrics for measuring detection time in other sections of this chapter, so we will not repeat them here.

Timeliness of corrective actions

How quickly does the organization take corrective action? The sooner the better. This indicator reflects the organization’s ability to respond to incidents. The following measurements help determine how well an incident response team can perform its tasks.

  • Time from open to response. The time when the response action began minus the time the email was first opened.

  • Time from click to response. The time when the response action began minus the time of the first click.

These metrics measure the interval between the first time an email is opened or a link is clicked, as appropriate, and the time the response begins. Response actions include, but are not limited to, blocking the link, blocking the sender, initiating a site takedown, and informing users of the attack. These metrics say nothing about whether the actions taken were adequate (that is the next metric). Again, a lot depends on the timing of the response. It is important to respond appropriately to incidents, but an action means little if it comes too late.
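Both response intervals are plain timestamp subtractions. The following sketch uses a hypothetical `IncidentTimeline` record and made-up timestamps to show the calculation.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class IncidentTimeline:
    first_open: datetime        # first time any recipient opened the email
    first_click: datetime       # first time any recipient clicked the link
    response_started: datetime  # when the first response action began


def response_intervals(t: IncidentTimeline) -> dict[str, timedelta]:
    """Time from first open and from first click until the response began."""
    return {
        "open_to_response": t.response_started - t.first_open,
        "click_to_response": t.response_started - t.first_click,
    }


# Hypothetical timeline, for illustration only.
timeline = IncidentTimeline(
    first_open=datetime(2023, 7, 3, 9, 5),
    first_click=datetime(2023, 7, 3, 9, 12),
    response_started=datetime(2023, 7, 3, 10, 30),
)

for name, interval in response_intervals(timeline).items():
    print(f"{name}: {interval}")
```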

Response efficiency

No less important than the timeliness of the response is its effectiveness. If the defenders’ actions stop the attack, great. However, in some cases, the response can amplify an attack and end up working in the attacker’s favor.

During one engagement, I was blocked after sending about 50% of the emails. I had been sending them in batches of 7 to 15 recipients at a time. Only about 20% of recipients followed my link, and only 6% entered any information. I thought I had failed miserably.

The next morning, I logged in and saw that 42% of the organization’s employees had entered data into the phishing form, and some had done so twice or more. Why did this happen?

The network administrator who blocked me had forwarded the email to the entire organization without removing or disabling the link in it. This created a Streisand-like effect: in trying to warn people about the campaign, the administrator personally put it in front of a large number of people, many of whom clicked on the link out of curiosity. As the old English saying goes, curiosity killed the cat.

Quantitative assessment of risk

Quantifying risk isn’t easy, but it’s important, because the report you send to your client should organize your findings by severity. Different methodologies can be used to assess risk, both qualitative (using subjective labels such as “critical”, “high”, “medium”, “low”, and “informational”) and quantitative (for example, on a scale from 0 to 10). Two such methodologies are the OWASP Risk Rating Methodology and the Common Vulnerability Scoring System (CVSS).

Unless your employer or client clearly requires a quantitative risk assessment, I recommend sticking with a qualitative one. Quantitative analysis requires all data points to be in a numerical format, and translating some data points into an interpretable quantitative format adds unnecessary complexity. Most of our metrics are quantitative, but we cannot easily and unambiguously convert all user actions into numerical values. For example, forwarding an email and deleting an email are unrelated actions. By assigning numerical values to them, we imply a relationship between them that does not exist. When determining risk severity, consider the likelihood and severity of an incident, then assign reasonable weights to these factors to arrive at a single rating.
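One way to combine likelihood and severity into a single qualitative rating is sketched below. The 1 to 5 scales, the weights, and the thresholds are all hypothetical choices made for illustration; they do not come from OWASP, CVSS, or the book, and should be agreed with the client before use.

```python
# Hypothetical weights: severity counts somewhat more than likelihood.
LIKELIHOOD_WEIGHT = 2
SEVERITY_WEIGHT = 3

# Hypothetical thresholds on the weighted score (possible range: 5 to 25).
THRESHOLDS = [
    (22, "critical"),
    (17, "high"),
    (12, "medium"),
    (8, "low"),
]


def risk_label(likelihood: int, severity: int) -> str:
    """Map likelihood and severity (each rated 1-5) to a qualitative label."""
    for name, value in (("likelihood", likelihood), ("severity", severity)):
        if not 1 <= value <= 5:
            raise ValueError(f"{name} must be between 1 and 5, got {value}")
    score = LIKELIHOOD_WEIGHT * likelihood + SEVERITY_WEIGHT * severity
    for minimum, label in THRESHOLDS:
        if score >= minimum:
            return label
    return "informational"


print(risk_label(likelihood=3, severity=4))  # high
```

The number is only an intermediate step; the report would show the label, together with the reasoning behind the likelihood and severity ratings.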

Next, determine which risk should be considered critical, high, medium, low, and informational. Below are standard definitions that you can use in your report as you see fit.


Critical

These are risks that can lead to catastrophic consequences, prolonged downtime or the cessation of all operations. Usually, such threats are implemented to the maximum extent and at one time. Critical level incidents often become public knowledge and have a significant impact on an organization’s ability to conduct business. They can also be life-threatening. In the case of an information security incident, it can be a leak of restricted or sensitive data, such as personal data or protected health information, which is what happened at Equifax and similar organizations.

High

These risks can lead to costly or serious downtime, damage or interruptions. The entry barrier for penetration and influence is low. These have major implications and can affect sensitive or restricted data, although to a lesser extent than critical risks.

Medium

These risks may lead to disruptions or problems in the customer’s organization, but not major downtime. This includes, for example, gaining access to systems that can be used to navigate to other systems or objects, as well as leaking non-public data that is not particularly sensitive.

Low

Incidents from this group represent a small risk for the customer. The implementation of such threats may depend on other factors, such as local physical access to the network, or require that another vector of exploitation has already been executed. These risks are associated with minimal disruption to the organization’s activities in the event of success.

Informational

Currently, such incidents do not pose a risk, but do not meet advanced security requirements or may become risky later.


Reporting

This section will help you write a finished report for your client. Writing the report is not as exciting as the attack itself, but the report is what clients are actually paying you big bucks for. However, making it useful is not an easy task. The truth is that some clients will read the report carefully, while others will put it on the shelf without even looking at it. If people don’t read the report, how can they draw conclusions and fix the flaws? This section offers an answer by looking at the problem from two angles: the finished report that the client needs to read, and the situations in which you should stop and call the client.

Learn when to call

The report is not the only tool of communication with the client.

Stop and call the client when critical computing resources are at risk. For example, if you discover an attacker on the client’s network or another suspicious condition that may worsen over time, notify the client immediately. For anything else related to the engagement, feel free to provide brief updates by email or phone, but be sure to clarify that nothing in these communications is official or legally binding. Otherwise, you could find yourself in court if the information in those updates contradicts the information in your final report. The report should be the main, and ideally the only, formal communication between you and the client after the contract is signed.

Writing a report

I advise you to write the report as you go so that you don’t miss details or have to dig through your notes after you’re done. As Chris Sanders, author of Practical Packet Analysis (No Starch Press, 2017), says, your report should be clear and concise, but it should also tell a story. By using narrative techniques, you can engage your readers and entice them, perhaps even through a little social engineering, to read the report in its entirety.

What do I mean by narrative techniques? Explain what steps you took and why they were important. Make it sound like you’re a real badass. Describe what you saw, your analysis of the situation, and the results. A little trick: to captivate readers, write in the active voice rather than the passive. For example, the phrase “our consultants compiled a list of dangerous site vulnerabilities” is in the active voice. The phrase “a list of dangerous site vulnerabilities was compiled” is in the passive voice, and it doesn’t read as well. The first version is therefore preferable.

Depending on whether you are self-employed or employed, reporting time may be billed at a lower rate than the task itself, or may not be billed at all. Don’t use all the time allotted for a report just for the sake of using it. Do only what you really need. Consideration should also be given to the time it will take for the documents to be reviewed (for example by editors, legal teams or quality assurance specialists).

Structure of the report

To begin, choose a report template: one provided by your employer, the template from Appendix 2, one found on the Internet, or one you design from scratch. In this section, we will use the template from Appendix 2.

In the Rationale section, explain the parameters that limit the testing and the reasons why it was performed. Specify here the scope of work in accordance with the TOR, the testing rules provided by the customer, and the parameters that you must comply with. This section should be no longer than a page, ideally one or two paragraphs.

The summary of the report follows. Use this section to give an overview of what you did, what you found, and how you evaluate it. You can also add general remediation advice. Don’t go into too much detail, as you should assume that the audience for this section is non-technical and wants a slightly more engaging reading experience than a washing machine manual.

After the summary, there should be a section that outlines your main findings. This is where you identify the main issues that should concern the client. Using a risk rating system similar to the one described in the section on quantitative assessment of risk, determine which findings deserve a critical or high rating and include only those. (All other findings are listed in the general findings section later in the document.) For each major finding, explain what the finding is, how it can be exploited, what the potential outcomes are, how to test it independently, and how to fix the problem. Do your best to convey the seriousness of these findings to your audience. In conversations with executives, I have found it helpful to describe the risk by pointing to specific negative consequences, such as appearing in the national news or being accused of negligence in court. Such specific details can help get managers’ attention.

The next section should detail the OSINT information you have collected. For each piece of information, provide a brief description of the tool you used or a screenshot of the information as proof. If the data is available online, you can also provide a link. Without supporting data, customers will not be able to verify the veracity of the information you provide.

If your screenshots contain information that could be damaging in any way, you can encrypt the document before sending it to the client. I have also worked with clients who did not accept reports in digital form. They required that the report be sent by regular mail, as a paper printout, so as not to leave an “electronic trail” if anything related to the report ended up in court. Although you do not need to label the risks in this section, since significant risks will be listed and labeled in the findings section, highlight critical and high risks in bold to draw attention to them.

The OSINT section is followed by a social engineering section that describes the engagement itself. If you used multiple types of social engineering, use subheadings to break this section down by type of interaction: phishing, vishing, and on-site testing. If you used a hybrid approach, meaning your phishing and vishing were linked together, put them in a “Hybrid” subsection. In each subsection, explain all the pretexts used to interact with the target. Explain what you did and what the results were. Then use the metrics described in the Measurement section to explain the impact of the results. If you’ve worked with the client before, you can also compare the results of this engagement with the previous one so the client can see their progress.

Put all of your findings in the next section. You can be verbose here. For each finding, explain the problem, how you found it, the evidence for it, references explaining why it’s a problem, how to fix it completely, and possible compensating measures if a full fix isn’t possible. This section will repeat some of the content of the summary and main findings sections. Explain how to fix each problem in a one-paragraph format. Finally, complete the remediation and recommendations section.

Make recommendations for staff training, technical solutions or other changes in the company’s culture. The third part of this book discusses such remedies.

Keep in mind that you are just recommending. You have no authority to demand any changes, and the customer may or may not resolve the issue. While you may feel a sense of ownership in the project, ultimately it’s not your problem if the client chooses not to listen to your advice.

Text quality control

I recommend having several people, all under non-disclosure agreements, review the report before you deliver it to the client. One person should check all the technical aspects of the report, while another should check the grammar, spelling, and style. Your reviewers should also assess the length of the report to ensure that it is detailed enough to get the point across without being too verbose. As Frances Saux, the editor of this book, can attest, I struggle with this all the time. So do many social engineers. And as my publisher Bill Pollock has pointed out, social engineers use their ability to talk incessantly as a tool. When communicating with people who are far from technology and not involved in security, this strength becomes a weakness.

We used materials from the book “Social Engineering and Ethical Hacking in Practice”, which was written by Joe Gray.
