How Successful are Targeted Phishing Attacks - A Real World example

WHITEPAPER


ThreatMetrix® Labs Report
May 2015
Author: Andreas Baumhof, CTO, ThreatMetrix Inc.


Contents

Introduction
Key Takeaways
The convergence of Phishing and Malware
Targeted Phishing Attack
    How to remove obviously fake entries
How the Internet makes it easy to "enrich" stolen PII with other available PII
    Reverse Telephone Lookup
        Match rates
    Social Networking Sites
        Example 1
        Example 2
        Example 3
        Example 4
        Example 5
Data
    How accurate is the Geolocation of IP addresses?
    Operating System
    Password stats (strength...)
        Common phrases used
        Password Strength
        Password Length
        Password distribution
    Demography
        Gender
        Age
Conclusion
Appendix A
Appendix B
Appendix C
More Information


Introduction

Personal information is being lost everywhere. Some called 2014 the year of the data breach and 2015 the year of the mega data breach. At the same time, we spend a lot of time looking at really sophisticated malware attacks, but how successful are phishing attacks in 2015? Well, it turns out they are very successful by every measure.

Phishing attacks are well and truly alive. They are certainly no longer those horrible-looking, poorly worded websites, and sophisticated Trojans such as Dyre combine social engineering, sophisticated malware and classical phishing into one hell of an attack.

So what can a fraudster expect in 2015 when running a sophisticated phishing attack? How much personal information are people willing to provide, if convinced properly? How easy is it to enrich the data with other data sources (either public or private)?

This ThreatMetrix Labs report looks behind the scenes of one such phishing campaign[1] in detail, and the results are shocking.

Key Takeaways

This research confirms one of the best-known secrets in the industry: targeted attacks produce high-quality results. This phishing attack wasn’t a poorly written website spammed out to millions of Internet users around the world. It was very targeted, and as a result the quality of the data is very high. Some key takeaways:

• It is mind-blowingly easy to remove the fake phishing entries. Just three simple rules eliminated 100 percent of the fake entries and still left 17 percent of the entries as genuine, high-quality data, which was far beyond our expectations.
    o We found publicly available databases that confirm the remaining good data is indeed valid in 92 percent of the cases!
• It is very easy for a fraudster to "enrich" stolen data with other available data sources (either publicly available sources such as social media, or other data breach databases).
• IP geolocation is surprisingly accurate. The average distance between the geolocation of the IP address and the geolocation of the mailing address is just 63 miles, and in more than 50 percent of the cases the distance is less than 10 miles.

• The passwords chosen by the victims are a mess: more than 98 percent of the passwords used fall under the category of "shouldn't be used for anything serious."
• Almost 25 percent of the victims responded to the phishing attack on their mobile phone, which is very high considering that the phishing attack forced the victims to respond to 22 questions!

[1] We notified the relevant financial institutions immediately upon getting access to this information, to make sure the accounts of the victims can be protected.

The convergence of Phishing and Malware

Phishing became popular after 2001, mainly targeting the financial community. Phishing is a tactic whereby an attacker tricks a victim into disclosing personal information on a website. Often the website mimics the look and feel of the phished brand to trick the victim into entering his or her personal information.

Phishing quickly became very popular, with a peak period in 2009 and 2010, when this attack was very successful. After that time, more and more phishing attacks moved toward a more targeted approach. This had a lot to do with many financial institutions implementing two-factor authentication, whereby phished information (such as the one-time password) has a very limited lifespan.

But the fraudsters evolved too. I remember the case of a financial institution in Europe in 2008 finding out that there was a phishing site asking users to provide their phone numbers. This bank had implemented transactional two-factor authentication tokens and wondered what this was all about, until they found out that the fraudsters would ring up the victims, pretend to call from the bank and trick them into disclosing the one-time password from their two-factor authentication devices.

More recently, malware such as Dyre includes phishing components as part of its socially engineered attacks, combining malware infections with phishing sites.

In the end, phishing means stealing personal information from you, and there are hundreds of different ways to do it. Only a fool believes that fraud will stop once we mitigate one successful attack vector. It will move and evolve, which is exactly what we are seeing with phishing.

Targeted Phishing Attack

The data we’ll be looking at in this report is from a targeted phishing attack. The attack used many layers to hide its origin. It also encrypted the data and moved it around various C2 servers. But as mentioned in the previous section, in the end what mattered was the information that the fraudsters were able to collect. And it was a lot.

Below is the information that we found in the data set. It comprises 22 individual attributes.

• Username (Username of the online application)
• Password
• Description (This field was empty for all entries)
• Home phone
• Mobile Phone
• ATM Pin
• Tel Pin
• Driver’s License (DL)
• Date of Birth (DOB)
• Mothers Maiden Name (MMN)
• Social Security Number (SSN)
• Full Name
• Secret Question 1
• Secret Answer 1
• Secret Question 2
• Secret Answer 2
• Secret Question 3
• Secret Answer 3
• Secret Question 4
• Secret Answer 4
• Secret Question 5
• Secret Answer 5
• The IP address of the victim (IP)
• Browser String (The User Agent of the victim)

We then "enriched" this raw information by adding the following attributes:

• Browser Type
• Browser Name
• Browser Version
• OS Version
• IP Country
• IP ISP
• IP Region
• IP City
• IP Latitude
• IP Longitude
• IP First Seen (TMX)
• Trust Tags (TMX)
• IP Score (TMX)

How to remove obviously fake entries

The task was to remove all obviously fake entries from this database, something we thought would involve a lot of manual work. It turns out that this is surprisingly easy and can be fully automated.

First of all, as a fraudster, you have to be ready to be abused. The most common name was "F... you." It is also amazing how many people think that they can hunt a fraudster down.

Just three rules (attached in Appendix A) eliminated 100 percent of the fake entries (!). These rules removed 83 percent of all entries, leaving 17 percent, which is still pretty high. Of the remaining entries, we ran detailed manual checks on approximately 10 percent, and all of them were confirmed to be legitimate information (more on this later). We later found that in more than 75 percent of the cases, we could use a public reverse telephone number search engine to confirm that these entries are indeed valid.

Two things are astonishing:

1. It is very easy for a fraudster to "weed" out the fake information and focus on the really valuable information with just three simple rules.
2. Of the rest, we could establish very easily that 92 percent of the submitted information is genuine, which is much higher than we ever thought it would be.
    A. It is very easy to ascertain that this is real information, which is important if the fraudster doesn't intend to use the information but to sell it in underground markets. Better quality data equals more money.

How the Internet makes it easy to "enrich" stolen PII with other available PII

One assumption that is quoted quite a bit is that once fraudsters have stolen some piece of PII, it is easy for them to complement it with other sources to build a complete picture. In this section, we'll look into this claim a bit.

Reverse Telephone Lookup

Most of the phished users were from one particular country that was targeted, and there are really nice websites available that allow you to search for people (like a telephone book or yellow pages). In many countries, there are services that allow you to do a reverse phone number lookup. As the phished information contained the phone number but not the address, we checked for how many entries we could find a listing associated with the telephone number.

Match rates

In more than 92 percent of the results returned by the reverse phone lookup, the name matched the phished name. This, together with all the information above, confirms that all the remaining entries are "good," legitimate entries. This number is much higher than we anticipated and indicates that the quality of this phishing campaign is very high.


Social Networking Sites

There is an assumption that virtually all of our personal information has been leaked in one of the data breaches of the last couple of years. While this is certainly true, we were very interested to find out how much people are willingly sharing on social media sites and how easy it is to take one piece of data and "enrich" it with publicly available data.

One particular "problem" we faced with this campaign is that many of the victims were not the millennials who engage heavily in social media. The average age of the (cleaned) dataset was 57. We still found many examples of the power of social media sites; a few of them follow.

Example 1

• Facebook: Through Facebook, we can confirm that the person exists.
• LinkedIn: Provides the information that the person is a self-employed bookkeeper.
• Airbnb reveals that
    o The city matches the geolocation of the IP address from the victim
    o The last two digits of the telephone number (which matches, too)
    o The partner's name
    o A family picture
    o Date when the person joined Airbnb
    o Description of the household
    o Confirmation of the job from LinkedIn
• Another apartment rental website
    o Confirms all of the above, inclusive of the telephone number in clear.
    o Tells me the address, which matches our records

Example 2

• Twitter: Location from Twitter matches the geolocation of the IP address
• Dating Site: Date of birth matches the date of birth of the phished credentials

Example 3

• Facebook: City on Facebook matches the city from the geolocation of the IP address
• Telephone book: confirms the name and the address


Example 4

• MeetMe: searching for the name confirms
    o The city (match with the geolocation of the IP address)
    o The age and date of birth

Example 5

• Government Agency
    o Name, job title, employer, age and date of birth confirmed by publicly available CV
    o Salary is publicly disclosed too

Data

How accurate is the Geolocation of IP addresses?

The dataset provided us with a unique opportunity to see the value of IP addresses. The original dataset did not include a postal address, only an IP address. Through external services (social media sites, reverse telephone lookups), we've been able to enrich the data with the postal address.

• Armed with the IP address, we can use geolocation to get its latitude and longitude coordinates (plenty of providers available).
• Armed with the postal address, we can likewise resolve the latitude and longitude coordinates of the postal address (plenty of services available).

Now we can compare the two, and the geolocation of the IP address turns out to be surprisingly accurate. For the data below, we removed less than 3 percent of the entries as outliers where the geolocation was completely different (e.g. on a different continent). The average distance between the geolocation of the IP address and the geolocation of the mailing address is just 63 miles.

In more than half of the data sets (54%), the distance between the geolocation of the IP address and the geolocation of the mailing address was less than 10 miles!
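The comparison described above boils down to a great-circle distance between two latitude/longitude pairs. The report does not say which formula or tooling was used, so the following is only a minimal sketch that assumes the common haversine formula; the function names and the sample coordinates are made up for illustration and are not taken from the dataset.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius, an approximation


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two points in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def summarize(distances, near_miles=10.0):
    """Return (mean distance, share of distances below near_miles)."""
    mean = sum(distances) / len(distances)
    share_near = sum(d < near_miles for d in distances) / len(distances)
    return mean, share_near


# Hypothetical pairs: (IP latitude, IP longitude, address latitude, address longitude)
pairs = [
    (40.7128, -74.0060, 40.7306, -73.9866),    # two points in the same city
    (34.0522, -118.2437, 34.0522, -118.2437),  # identical points
    (41.8781, -87.6298, 42.3314, -83.0458),    # Chicago to Detroit
]
distances = [haversine_miles(*p) for p in pairs]
mean_miles, share_under_10 = summarize(distances)
```

Applied to a full dataset, `summarize` would yield the two kinds of figures quoted in this section: an average distance and the share of entries within 10 miles.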

[Figure: distribution of the distance between the geolocation of the IP address and the geolocation of the mailing address]

Operating System

The operating system of the victims’ computers isn’t really surprising, with Windows holding 60 percent.

More interesting is the split between desktop and mobile: it is quite surprising that the share of victims on mobile platforms is almost 25%!
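The operating system and the desktop/mobile split are derived from the Browser String (User Agent) captured with each entry. The report does not describe how this was done, so the following is only a rough sketch that assumes simple substring checks; the function names and the sample User Agent strings are illustrative, and a production system would more likely rely on a maintained User Agent parsing library.

```python
from typing import Iterable, Tuple


def classify_user_agent(user_agent: str) -> Tuple[str, str]:
    """Return (os_family, platform) using simple substring checks (a heuristic)."""
    ua = user_agent.lower()
    # Order matters: Android strings contain "linux", iOS strings contain "mac os x",
    # and Windows Phone strings contain "windows".
    if "windows phone" in ua:
        return "Windows Phone", "mobile"
    if "android" in ua:
        return "Android", "mobile"
    if "iphone" in ua or "ipad" in ua or "ipod" in ua:
        return "iOS", "mobile"
    if "windows" in ua:
        return "Windows", "desktop"
    if "mac os x" in ua or "macintosh" in ua:
        return "Mac OS X", "desktop"
    if "linux" in ua:
        return "Linux", "desktop"
    return "Other", "unknown"


def mobile_share(user_agents: Iterable[str]) -> float:
    """Fraction of User Agent strings classified as mobile."""
    platforms = [classify_user_agent(ua)[1] for ua in user_agents]
    return platforms.count("mobile") / len(platforms)


# Illustrative User Agent strings, not taken from the dataset
samples = [
    "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:38.0) Gecko/20100101 Firefox/38.0",
    "Mozilla/5.0 (iPhone; CPU iPhone OS 8_3 like Mac OS X) AppleWebKit/600.1.4 "
    "(KHTML, like Gecko) Version/8.0 Mobile/12F70 Safari/600.1.4",
    "Mozilla/5.0 (Linux; Android 5.0; Nexus 5 Build/LRX21O) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/43.0.2357.78 Mobile Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_3) AppleWebKit/600.5.17 "
    "(KHTML, like Gecko) Version/8.0.5 Safari/600.5.17",
]
share = mobile_share(samples)
```

The order of the checks is the main design point: several mobile User Agents embed desktop tokens, so the mobile patterns have to be tested first.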


Password stats (strength...)

There is no better introduction to this section than https://xkcd.com/936/:

“Through 20 years of effort, we’ve successfully trained everyone to use passwords that are hard for humans to remember, but easy for computers to guess.”


With that said, we took the liberty of making some high-level assessments of the passwords used by the phishing victims.

Common phrases used

The first thing we wanted to check was whether common passphrases are used. We used a very simple list of only 7,187 common passwords (attached in Appendix B) and found that 7 percent of the passwords were based on common swear words.
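A check of this kind can be sketched as a case-insensitive substring match against a word list. The list in Appendix B is not public, so the tiny list below is a hypothetical stand-in, the sample passwords are made up, and the matching rule (substring, minimum word length of four) is an assumption rather than the method used for this report.

```python
def is_based_on_common_word(password, wordlist, min_len=4):
    """True if the password contains any word from wordlist (case-insensitive)."""
    pw = password.lower()
    return any(len(word) >= min_len and word in pw for word in wordlist)


def share_based_on_common_words(passwords, wordlist):
    """Fraction of passwords that contain at least one word from wordlist."""
    hits = sum(is_based_on_common_word(p, wordlist) for p in passwords)
    return hits / len(passwords)


# Hypothetical stand-in for the (non-public) list of common passwords
COMMON_WORDS = {"password", "letmein", "qwerty", "dragon"}

# Made-up sample passwords, not taken from the dataset
sample_passwords = ["Password123", "x7#Kq!2v", "myQWERTY!", "sunflower88"]
common_share = share_based_on_common_words(sample_passwords, COMMON_WORDS)
```

Substring matching catches simple decorations such as appended digits, at the cost of occasional false positives for short words, which is why the sketch ignores list entries shorter than four characters.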

A quick Google search reveals hundreds of sites that provide password lists containing millions of common passwords...

Password Strength

Having set the scene with the xkcd comic, we tried to evaluate password strength by calculating the entropy of each password (in bits). The number of bits listed for entropy is an estimate based on letter-pair combinations in the English language. We then categorized the passwords into:

• Very weak (< 28 bits)
• Weak (28 - 35 bits)
• Reasonable (36 - 59 bits)
• Strong (60 - 127 bits)
• Very strong (>= 128 bits)


Anything less than 60 bits shouldn't be used for anything serious (such as online banking).

The results are devastating:

• Not a single password was "very strong" as per this definition
• 45% of the passwords chosen were "very weak"
    o With a typical desktop PC, a "very weak" password can be cracked in less than 10 minutes (often just seconds).
• More than 98% of the passwords used fall under the category of "shouldn't be used for anything serious"
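The categorization itself is easy to sketch in code. The thresholds below are the ones listed in this section, with 128 bits assumed to count as "very strong". The entropy estimate, however, is a deliberately simpler stand-in: the report uses an estimate based on English letter-pair statistics, whereas this sketch assumes a naive character-pool model (password length times log2 of the pool size), so its numbers will differ from the report's. All function names are illustrative.

```python
import math
import string


def estimate_entropy_bits(password):
    """Naive estimate: length * log2(size of the character pool in use).

    Assumption: this pool-size model is NOT the letter-pair estimate used in
    the report; it is a simple stand-in for illustration only.
    """
    pool = 0
    if any(c in string.ascii_lowercase for c in password):
        pool += 26
    if any(c in string.ascii_uppercase for c in password):
        pool += 26
    if any(c in string.digits for c in password):
        pool += 10
    if any(c in string.punctuation for c in password):
        pool += len(string.punctuation)  # 32 ASCII punctuation characters
    if pool == 0:
        return 0.0
    return len(password) * math.log2(pool)


def categorize(bits):
    """Map an entropy estimate in bits to the categories listed in this section."""
    if bits < 28:
        return "Very weak"
    if bits < 36:
        return "Weak"
    if bits < 60:
        return "Reasonable"
    if bits < 128:
        return "Strong"
    return "Very strong"


def ok_for_serious_use(bits):
    """Anything below 60 bits 'shouldn't be used for anything serious'."""
    return bits >= 60


category = categorize(estimate_entropy_bits("abc123"))
```

A pool-size model treats every character as independent, so it overstates the strength of dictionary words; an estimate built on language statistics, as used for this report, rates such passwords lower.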

Password Length

The vast majority (80%) had fewer than 10 characters, with eight characters being the most commonly used password length.
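To put password length into perspective, the size of the search space can be computed directly. The sketch below assumes an alphabet of the 95 printable ASCII characters and an assumed rate of 350 billion guesses per second, which is the order of magnitude reported for dedicated GPU cracking clusters; both values are assumptions for illustration, not measurements from this report.

```python
PRINTABLE_ASCII = 95                          # assumed alphabet size
ASSUMED_GUESSES_PER_SECOND = 350_000_000_000  # assumed rate, for illustration


def keyspace(length, alphabet=PRINTABLE_ASCII):
    """Number of possible passwords of exactly `length` characters."""
    return alphabet ** length


def hours_to_exhaust(length, rate=ASSUMED_GUESSES_PER_SECOND):
    """Worst-case hours to try every password of the given length."""
    return keyspace(length) / rate / 3600


eight_char_space = keyspace(8)        # 95**8, roughly 6.63 quadrillion
eight_char_hours = hours_to_exhaust(8)
ten_char_hours = hours_to_exhaust(10)
```

Each additional character multiplies the work by 95 under these assumptions, so going from eight to ten characters raises the worst case by a factor of 9,025.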


For an 8-character password, there are 6.63 quadrillion possibilities (95^8, counting the 95 printable ASCII characters), which sounds like a lot. However, with the advent of dedicated password-cracking hardware, 8-character passwords can be cracked in less than 6 hours! (see http://arstechnica.com/security/2012/12/25-gpu-cluster-cracks-every-standard-windows-password-in-6-hours/)

Password distribution

The most commonly used password was… “password” (big surprise there). However, looking at the distribution, more than 98 percent of the passwords were unique, so there wasn’t much overlap in terms of commonly used passwords.

Demography

Gender

From a gender point of view, there was virtually no difference between male and female.


Age

The youngest victim was 15 and the oldest 90, with the average age being 57.

Conclusion

The data analytics exercise presented in this report shows that phishing attacks in 2015 are still highly effective and that targeted phishing attacks produce high-quality results. The use of phishing together with sophisticated malware (such as Dyre) is a trend that we expect to continue. The other trend is that more complex technology deployed to the customer base (such as two-factor authentication) actually opens up an opportunity for cybercriminals to exploit through socially engineered attacks. One great example of this was the aforementioned case where phishing sites captured the telephone numbers of victims, and the fraudsters then rang up the victims to trick them into revealing the two-factor authentication code.

Appendix A

The content in Appendix A is not available in the public version. Please request a private version of the ThreatMetrix Labs report at [email protected].

Appendix B

The content in Appendix B is not available in the public version. Please request a private version of the ThreatMetrix Labs report at [email protected].

Appendix C

The content in Appendix C is not available in the public version. Please request a private version of the ThreatMetrix Labs report at [email protected].

More Information

For more information on this report please contact [email protected].

© 2015 ThreatMetrix. All rights reserved. ThreatMetrix, ThreatMetrix Labs, and the ThreatMetrix logo are trademarks or registered trademarks of ThreatMetrix in the United States and other countries. All other brand, service or product names are trademarks or registered trademarks of their respective companies or owners.
