Our Data Online
Over 2.5 quintillion bytes of data are created every day and with each passing moment more
devices are introduced online. The Internet of Things promises to further connect every part of our
world, producing even more data ripe for analytics. Where this data is stored, who has access to it
and the types of information gathered produce both opportunities and risks.
In 2018, Facebook users across the world learned that large scale harvesting of personal data
without their consent had occurred; the main culprit in this instance was a company called
‘Cambridge Analytica’. The activity of millions of users, such as what they ‘liked’ and their friend circles,
was passed on to the company, which then used an algorithm to psychologically profile people based on their interactions. Using these
personality profiles, politically motivated advertisements were tailored and shown to specific users.
Unfortunately, Facebook are not alone in the large scale collection of personal data. Companies like Amazon and Google use our
shopping and browsing habits to ‘improve’ their services, while YouTube and Instagram give recommendations based on profiles and search
history. Users are increasingly worried about this practice: 67 of 92 participants in the survey answered yes
when asked if they are concerned about their data being collected and stored. These concerns are not unwarranted as even with
anonymisation techniques, it is possible to infer many details when analysing data sets. In 2006, AOL released millions of search queries
made by around 650,000 of its users. The company removed the IDs and IP addresses to allow researchers to study the information, yet it took only
a couple of days for individuals to be identified from their queries. In what could be seen as a breach of privacy, the US retailer
Target sent coupons for baby clothes to a customer’s daughter, having correctly determined from her shopping data that she was pregnant.
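The re-identification risk described above can be illustrated with a small sketch. It assumes a hypothetical table of ‘anonymised’ records in which names and IDs have been removed but a few ordinary attributes remain (the field names and values are invented for illustration). The code counts how many records share each combination of those attributes; any record whose combination is unique could potentially be linked back to a person using outside information. This is a simplified k-anonymity check, not a description of how the AOL or Target cases were actually analysed.

```python
from collections import Counter

# Hypothetical "anonymised" records: names and IDs removed, but
# quasi-identifiers (postcode, birth year, gender) are still present.
RECORDS = [
    {"postcode": "AB1", "birth_year": 1985, "gender": "F"},
    {"postcode": "AB1", "birth_year": 1985, "gender": "F"},
    {"postcode": "AB1", "birth_year": 1990, "gender": "M"},
    {"postcode": "CD2", "birth_year": 1972, "gender": "F"},
    {"postcode": "CD2", "birth_year": 1972, "gender": "F"},
    {"postcode": "CD2", "birth_year": 1972, "gender": "M"},
]

# Assumed set of attributes an outsider might already know about a person.
QUASI_IDENTIFIERS = ("postcode", "birth_year", "gender")


def group_sizes(records, keys=QUASI_IDENTIFIERS):
    """Count how many records share each combination of quasi-identifiers."""
    return Counter(tuple(r[k] for k in keys) for r in records)


def k_anonymity(records, keys=QUASI_IDENTIFIERS):
    """Return k: the size of the smallest group (0 for an empty data set)."""
    sizes = group_sizes(records, keys)
    return min(sizes.values()) if sizes else 0


def unique_records(records, keys=QUASI_IDENTIFIERS):
    """Return records whose quasi-identifier combination appears only once."""
    sizes = group_sizes(records, keys)
    return [r for r in records if sizes[tuple(r[k] for k in keys)] == 1]


if __name__ == "__main__":
    print("k =", k_anonymity(RECORDS))
    for record in unique_records(RECORDS):
        print("unique:", record)
```

A data set with k equal to 1 contains at least one record that stands alone. Using fewer attributes (for example the postcode only) raises k, which is why generalising or dropping fields is a common mitigation.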
Is the collection of user data always a bad thing? Not necessarily. A team of researchers from Google were able to track the spread of
influenza without the results of a single medical check-up, and they could do this more quickly than the CDC. A recent report estimated
that the US healthcare system could save hundreds of billions of dollars each year through better integration and analysis of medical
data. An overarching system that uses information gathered from sources ranging from clinical studies to smart devices could save not only money but also
lives. Interestingly, when survey participants were asked if they would use a blockchain application that offered
financial incentives for their data, 39% of respondents answered yes. So perhaps it is not that data collection is inherently bad, but that
users want more control over when and with whom they share their data. The idea that you are rewarded for your information, instead
of it being harvested and sold with no immediate personal benefit, is certainly a more attractive proposition.
Recent Breaches
In 2018 over 2000 data breaches were reported from more than 60 different countries; of these,
the healthcare industry suffered around 500. In 2019 an estimated 41 million healthcare
records were exposed, stolen or illegally disclosed and the average cost of a data breach was close to
$4 million. The cost of these data breaches is steadily increasing over time, rising 12% from 2014 to 2019. Throughout 2020, at least 8 billion records containing sensitive information have been exposed online, making it one of the worst years in data breach history.
Personal data breaches at organisations enable mass identity fraud and the risk grows every day. The information leaked is often
distributed online, accumulating in the hands of criminals and causing an erosion of privacy. Vulnerable individuals are often targeted
repeatedly, resulting in a profound loss in quality of life.
Equifax is a large, top tier credit reporting agency. In 2017, it released statements acknowledging that it was the victim of a
cyberattack in which the personal data of some 148 million citizens was compromised. This information included names, dates of birth, driving
licences and even credit card numbers. Data stored by Equifax is not collected on an opt-in basis; the information is sourced from businesses and
institutions and can be very comprehensive. The attack made use of a vulnerability in Apache Struts, CVE-2017-5638, which allows
for remote command execution. On March 7th 2017, the Apache Software Foundation published a patch to fix the issue and on March 8th the Department of Homeland Security notified Equifax, along with other credit agencies, directing them to install the patch. A week after the company was notified of the patch, Equifax conducted a scan of their system and the subsequent report failed to highlight any vulnerability to the Apache Struts bug; this left the systems unpatched and unprotected up until late July 2017.
During this period Equifax noticed suspicious activity within their systems and therefore took the
application offline while hiring an external cybersecurity firm to conduct forensic analysis. This
investigation disclosed that many files had been breached.
The situation was further complicated when the company attempted to address the issue. To help
disseminate details of the breach to affected users, Equifax created a separate domain and webpage.
Almost immediately fake settlement and informational sites were created to exploit the situation,
resulting in further opportunity for criminals.
The U.S. bank Capital One was subject to a large security incident in 2019. It is the fifth largest consumer bank in America, with
strong investments in IT infrastructure (one of the first banks in the world to migrate its datacentres to the cloud). Details of the leak
showed that names, addresses, phone numbers and income details were amongst some of the data subject to unauthorised access.
The breach affected approximately 100 million consumers and small businesses across the U.S. and Canada. Interestingly, this breach
was discovered via their responsible disclosure program when an email from an outsider informed them that their customers’ data was
available on a GitHub page.
As a result of FBI investigations, it was discovered that a woman named Paige A. Thompson was single-handedly responsible for the
breach; she was later accused of stealing data from over 30 different companies. Thompson created a scanning software tool that was
able to check cloud based servers and identify misconfigured firewalls, allowing her to execute commands remotely and thereby
gain access to the servers. The FBI identified a script hosted on GitHub that, with only three commands, allowed unauthorised access to
servers hosted by Amazon.
In the survey, 28 respondents confirmed that their personal information had been leaked
and 18% of individuals questioned said that they had experienced identity fraud. Worryingly, only
38% of participants know how to check if their information has been compromised, which reveals the
growing challenge of managing personal data online.
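One way such a check can work, at least for passwords, is sketched below. It assumes the public Pwned Passwords range service, which is not mentioned in the survey or elsewhere in this report and is used here purely as an illustration. The password is hashed locally and only the first five characters of the hash are sent; the service returns every known hash suffix for that prefix and the comparison happens on the user's own machine. The function names are hypothetical, and the sketch covers passwords only, not other kinds of personal data.

```python
import hashlib
import urllib.request

# Assumption: the public Pwned Passwords range endpoint, used only as an
# illustration of how a breach check can avoid revealing the password.
RANGE_URL = "https://api.pwnedpasswords.com/range/{prefix}"


def split_hash(password):
    """Return (prefix, suffix) of the upper-case SHA-1 hex digest."""
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    return digest[:5], digest[5:]


def count_in_response(suffix, body):
    """Look for the suffix in 'SUFFIX:COUNT' lines; return its count or 0."""
    for line in body.splitlines():
        candidate, _, count = line.strip().partition(":")
        if candidate.upper() == suffix and count.strip().isdigit():
            return int(count)
    return 0


def fetch_range(prefix):
    """Download all known hash suffixes for a 5-character prefix (needs network)."""
    url = RANGE_URL.format(prefix=prefix)
    with urllib.request.urlopen(url, timeout=10) as resp:
        return resp.read().decode("utf-8")


def breach_count(password, fetch=fetch_range):
    """Return how many times the password appears in the breach corpus.

    Only the 5-character hash prefix is passed to `fetch`; the full hash
    and the password itself never leave the local machine.
    """
    prefix, suffix = split_hash(password)
    return count_in_response(suffix, fetch(prefix))
```

Calling `breach_count("some password")` contacts the service; passing a different `fetch` function allows the logic to be exercised offline. A result above zero suggests the password has appeared in a known breach and should be changed.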
Challenges
It is becoming increasingly difficult to keep track of where our data is being used and who has access to it. We often trust many
different services to keep sensitive information safe and apply good security practices. However, as shown by the recent breaches, this
is not always the case and it is not unusual for companies to take a relaxed approach to security. Simple misconfiguration of firewalls
and server software, along with a lack of regular updates and a failure to remain diligent, is commonplace. Despite increased investment
in cybersecurity solutions, organisations are still suffering from major breaches. The consequences go far beyond financial settlements
and loss of reputation: as more personal data is leaked, it becomes extremely difficult to ensure that institutions are actually
communicating with the true users of their service and not an attacker. According to Verizon’s security research, more than a quarter
of security incidents went unnoticed for many months with some remaining vulnerable for over a year.
The World Economic Forum rates cyber security breaches as one of the five most serious risks facing the globe today. Increasing
complexity and sophistication of attacks, profitability of obtaining personal data, lack of trained cybersecurity analysts, the increase in
remote working and the use of cyber warfare by state sponsored actors are serious concerns for organisations. Consistent evaluation
of security controls is vital but costly to implement, and it can be difficult for small businesses to effectively monitor the wide scope of the
cybersecurity landscape. That being said, research shows that 97% of attacks could be mitigated if organisations implemented effective
controls. Within healthcare, the biggest threat of data breaches comes from internal sources. As of 2017,
46% of security incidents were caused by employee behaviour such as clicking on malicious links, worker negligence and access abuse.
Computer training is mandatory for NHS staff, yet it is documented that only 12% of trusts have fulfilled their training obligations.
Chronic underinvestment in healthcare related IT services is stark in comparison to other industries, with as little as 1-2% of annual
budgets being spent on IT services when 4-10% can be seen in other business sectors.
As the world pushes forward with the Internet of Things, securing each and every device is a daunting challenge. The IoT is diverse and
can include devices that are very different from traditional computers. Often these devices are deployed on a large scale and users are
unaware of the potential threats. These devices can regularly be seen dotted around homes and businesses, always turned on and
listening. The interconnected future poses significant problems for our privacy, and security will not just be the job of organisations;
individuals must adapt and implement solid security practices themselves.