Privacy-invasive software

Privacy-invasive software is a category of computer software that ignores users’ privacy and that is distributed with a specific intent, often of a commercial nature. Three typical examples of privacy-invasive software are adware, spyware and content hijacking programs.

Background

In a computerized setting, such as the Internet, there is a wide variety of privacy threats to consider. Threats vary from the systematic capture of everyday events (e.g., what online sites that are visited or what items that are purchased from online stores) to the mass marketing based on the retrieval of personal information (spam offers and telemarketing calls are more common than ever) to the distribution of information on lethal technologies used for, e.g., acts of terror.

Today, software-based privacy-invasions occur in numerous aspects of Internet usage. Spyware programs set to collect and distribute user information secretly download and execute on users’ workstations. Adware displays advertisements and other commercial content often based upon personal information retrieved by spyware programs. System monitors record various actions on computer systems. Keyloggers record users’ keystrokes in order to monitor user behavior. Self-replicating malware downloads and spreads disorder in systems and networks. Data-harvesting software programmed to gather e-mail addresses have become conventional features of the Internet, which among other things results in that spam e-mail messages fill networks and computers with unsolicited commercial content. With those threats in mind, we hereby define privacy-invasive software as:

Definition

“ Privacy-invasive software is a category of software that ignores users’ right to be let alone and that is distributed with a specific intent, often of a commercial nature, which negatively affect[s] its users. ”

In this context, ignoring users’ right to be let alone means that the software is unsolicited and that it does not permit users to determine for themselves when, how and to what extent personally identifiable data is gathered, stored or processed by the software. Distributed means that it has entered the computer systems of users from (often unknown) servers placed on the Internet infrastructure. Often of a commercial nature means that the software (regardless of type or quality) is used as a tool in some sort of a commercial plan to gain revenues.

Problem with the spyware concept

In early 2000, Steve Gibson formulated the first description of spyware after realizing software, that stole his personal information, had been installed on his computer (Gibson Research Corporation). His definition reads as follows:
“ Spyware is any software which employs a user’s Internet connection in the background (the so-called ‘backchannel’) without their knowledge or explicit permission. ”

This definition was valid in the beginning of the spyware evolution. However, as the spyware concept evolved over the years it attracted new kinds of behaviours. As these behaviours grew both in number and in diversity, the term spyware became hollowed out. This evolution resulted in that a great number of synonyms sprang up, e.g. thiefware, evilware, scumware, trackware, and badware. We believe that the lack of a single standard definition of spyware depends on the diversity in all these different views on what really should be included, or as Aaron Weiss put it (Weiss 2005):
“ What the old-school intruders have going for them is that they are relatively straightforward to define. Spyware, in its broadest sense, is harder to pin down. Yet many feel, as the late Supreme Court Justice Potter Stewart once said, ‘I know it when I see it.’. ”

Despite this vague comprehension of the essence in spyware, all descriptions includes two central aspects. The degree of associated user consent, and the level of negative impact they impair on the user and their computer system (further discussed in Section 2.3 and Section 2.5 in (Boldt 2007a)). Because of the diffuse understanding in the spyware concept, recent attempts to define it has been forced into compromises. The Anti-Spyware Coalition (ASC) which is constituted by public interest groups, trade associations, and anti-spyware companies, have come to the conclusion that the term spyware should be used at two different abstraction levels (Anti-Spyware Coalition). At the low level they use the following definition, which is similar to Steve Gibson’s original one:
“ In its narrow sense, Spyware is a term for tracking software deployed without adequate notice, consent, or control for the user. ”

However, since this definition does not capture all the different types of spyware available they also provide a wider definition, which is more abstract in its appearance:
“ "In its broader sense, spyware is used as a synonym for what the ASC calls ‘Spyware (and Other Potentially Unwanted Technologies)’. Technologies deployed without appropriate user consent and/or implemented in ways that impair user control over:

1) Material changes that affect their user experience, privacy, or system security;
2) Use of their system resources, including what programs are installed on their computers; and/or
3) Collection, use, and distribution of their personal or other sensitive information."


Difficulties in defining spyware, forced the ASC to define what they call Spyware (and Other Potentially Unwanted Technologies) instead. In this term they include any software that does not have the users’ appropriate consent for running on their computers. Another group that has tried to define spyware is a group called StopBadware.org, which consists of actors such as Harvard Law School, Oxford University, Google, Lenovo, and Sun Microsystems (StopBadware.org). Their result is that they does not use the term spyware at all, but instead introduce the term badware. Their definition thereof span over seven pages, but the essence looks as follows (StopBadware.org Guidelines):
“ "An application is badware in one of two cases:

1) If the application acts deceptively or irreversibly.
2) If the application engages in potentially objectionable behaviour without: first, prominently disclosing to the user that it will engage in such behaviour, in clear and non-technical language, and then obtaining the user's affirmative consent to that aspect of the application."


Both definitions from ASC and StopBadware.org show the difficulty with defining spyware. We therefore regard the term spyware at two different abstraction levels. On the lower level it can be defined according to Steve Gibsons original definition. However, in its broader and in a more abstract sense the term spyware is hard to properly define, as concluded above.

Introducing privacy-invasive software

A joint conclusion is that it is important, for both software vendors and users, that a clear separation between acceptable and unacceptable software behaviour is established (Bruce 2005)(Sipior 2005). The reason for this is the subjective nature of many spyware programs included, which result in inconsistencies between different users beliefs, i.e. what one user regards as legitimate software could be regarded as a spyware by others. As the spyware concept came to include increasingly more types of programs, the term got hollowed out, resulting in several synonyms, such as trackware, evilware and badware, all negatively emotive. We therefore choose to introduce the term privacy-invasive software to encapsulate all such software. We believe this term to be more descriptive than other synonyms without having as negative connotation. Even if we use the word invasive to describe such software, we believe that an invasion of privacy can be both desired and beneficial for the user as long as it is fully transparent, e.g. when implementing specially user-tailored services or when including personalization features in software.
The three-by-three matrix classification of privacy-invasive software showing legitimate, spyware and malicious software.
The three-by-three matrix classification of privacy-invasive software showing legitimate, spyware and malicious software.

We used the work by Warkentins et al. (described in Section 7.3.1 in (Boldt 2007a)) as a starting point when developing a classification of privacy-invasive software, where we classify privacy-invasive software as a combination between user consent and direct negative consequences. User consent is specified as either low, medium or high, while the degree of direct negative consequences span between tolerable, moderate, and severe. This classification allows us to first make a distinction between legitimate software and spyware, and secondly between spyware and malicious software. All software that has a low user consent, or which impairs severe direct negative consequences should be regarded as malware. While, on the other hand, any software that has high user consent, and which results in tolerable direct negative consequences should be regarded as legitimate software. By this follows that spyware constitutes the remaining group of software, i.e. those that have medium user consent or which impair moderate direct negative consequences. This classification is described in further detail in Chapter 7 in (Boldt 2007a).

In addition to the direct negative consequences, we also introduce indirect negative consequences. By doing so our classification distinguishes between any negative behaviour a program has been designed to carry out (direct negative consequences) and security threats introduced by just having that software executing on the system (indirect negative consequences). One example of an indirect negative consequence is the exploitation risk of software vulnerabilities in programs that execute on users’ systems without their knowledge (Saroiu 2004).

Comparison to malware

The term privacy-invasive software is motivated in that software types such as adware and spyware are essentially often defined according to their actions instead of their distribution mechanisms (as with most malware definitions, which also rarely correspond to motives of, e.g., business and commerce). The overall intention with the concept of privacy-invasive software is consequently to convey the commercial aspect of unwanted software contamination. The threats of privacy-invasive software consequently do not find their roots in totalitarianism, malice or political ideas, but rather in the free market, advanced technology and the unbridled exchange of electronic information. By the inclusion of purpose in its definition, the term privacy-invasive software is a contribution to the research community of privacy and security.

Internet goes commercial

In the mid-1990s, the development of the Internet increased rapidly due to the interest from the general public. One important factor behind this accelerating increase was the 1993 release of the first browser, called Mosaic (Andreessen 1993). This marked the birth of the graphically visible part of the Internet known as the World Wide Web (WWW). Commercial interests became well aware of the potential offered by the WWW in terms of electronic commerce, and soon companies selling goods over the Internet emerged, i.e. pioneers such as book dealer Amazon.com and CD retailer CDNOW.com, which both were founded in 1994 (Rosenberg 2004).

During the following years, personal computers and broadband connections to the Internet became more commonplace. Also, the increased use of the Internet resulted in that e-commerce transactions involved considerable amounts of money (Abhijit 2002). As competition over customers intensified, some e-commerce companies turned to questionable methods in their battle to entice customers into completing transactions with them (CDT 2006) and (Shukla 2005). This opened ways for illegitimate actors to gain revenues by stretching the limits used with methods for collecting personal information and for propagating commercial advertisements. Buying such services allowed for some e-commerce companies to get an advantage over their competitors, e.g. by using advertisements based on unsolicited commercial messages (also known as spam) (Jacobsson 2004).

Commercially motivated adverse software

The use of questionable techniques, such as Spam, were not as destructive as the more traditional malicious techniques, e.g. computer viruses or trojan horses. Compared to such malicious techniques the new ones differed in two fundamental ways. First, they were not necessarily illegal, and secondly, their main goal was gaining money instead of creating publicity for the creator by reaping digital havoc. Therefore, these techniques grouped as a “grey” area next to the already existing “dark” side of the Internet.

Behind this development stood advertisers that understood that Internet was a “merchant’s utopia”, offering huge potential in global advertising coverage at a relatively low cost. By using the Internet as a global notice board, e-commerce companies could market their products through advertising agencies that delivered online ads to the masses. In 2004, online advertisement yearly represented between $500 million and $2 billion markets, which in 2005 increased to well over $6 billion-a-year (McFedries 2005) and (Zhang 2005)]. The larger online advertising companies report annual revenues in excess of $50 million each (CNET 2005). In the beginning of this development such companies distributed their ads in a broadcast-like manner, i.e. they were not streamlined towards individual users’ interests. Some of these ads were served directly on Web sites as banner ads, but dedicated programs, called adware, soon emerged. Adware were used to display ads through pop-up windows without depending on any Internet access or Web pages.

The birth of spyware

In the search for more effective advertising strategies, these companies soon discovered the potential in ads that were targeted towards user interests. Once targeted online ads started to appear, the development took an unfortunate turn. Now, some advertisers developed software that became known as spyware, collecting users’ personal interests, e.g. through their browsing habits. Over the coming years spyware would evolve into a significant new threat to Internet-connected computers, bringing along reduced system performance and security. The information gathered by spyware were used for constructing user profiles, including personal interests, detailing what users could be persuaded to buy. The introduction of online advertisements also opened a new way to fund software development by having the software display advertisements to its users. By doing so the software developer could offer their software “free of charge”, since they were paid by the advertising agency. Unfortunately, many users did not understand the difference between “free of charge” and a “free gift”, where difference is that a free gift is given without any expectations of future compensation, while something provided free of charge expects something in return. A dental examination that is provided free of charge at a dentist school is not a free gift. The school expects gained training value and as a consequence the customer suffers increased risks. As adware were combined with spyware, this became a problem for computer users. When downloading software described as “free of charge” the users had no reason to suspect that it would report on for instance their Internet usage, so that presented advertisements could be targeted towards their interests.

Some users probably would have accepted to communicate their browsing habits because of the positive feedback, e.g. “offers” relevant to their interests. However, the fundamental problem was that users were not properly informed about neither the occurrence nor the extent of such monitoring, and hence were not given a chance to decide on whether to participate or not. As advertisements became targeted, the borders between adware and spyware started to dissolve, combining both these programs into a single one, that both monitored users and delivered targeted ads. The fierce competition soon drove advertisers to further “enhance” the ways used for serving their ads, e.g. replacing user-requested content with sponsored messages instead, before showing it to the users.

The arms-race between spyware vendors

As the chase for faster financial gains intensified, several competing advertisers turned to use even more illegitimate methods in an attempt to stay ahead of their competitors. This accelerated the whole situation and pushed the “grey” area of the Internet closer and closer to the “dark” side (G?rling 2004). During this development users experienced infections from unsolicited software that crashed their computers by accident, uninvitedly changed application settings, harvested personal information, and deteriorated their computer experience through spam and pop-up ads (Pew 2005). Over time these problems lead to the introduction of countermeasures in the form of anti-spyware tools. These tools supported users in cleaning their computers from spyware, adware, and any other type of shady software located in that same “grey” area. As these tools were designed in the same way as anti-malware tools, such as anti-virus programs, they could only identify spyware that were already known, leaving previously unknown spyware undetected. To further aggravate the situation, a few especially illegitimate companies distributed fake anti-spyware tools in their search for a larger piece of the online advertising market. These fake tools claimed to remove spyware, but instead installed their own share of adware and spyware on unwitting users’ computers. Sometimes even accompanied by the functionality to remove adware and spyware from competing vendors.

New spyware programs are being added to the setting in what seams to be a never-ending stream, although the increase has levelled out somewhat over the last years. However, there still does not exist any consensus on a common spyware definition or classification, which we believe negatively affect the accuracy of anti-spyware tools, further rendering in that spyware programs are being undetected on users’ computers (Good et al. 2006) and (MTL 2006). Developers of anti-spyware programs officially state that the fight against spyware is more complicated than the fight against viruses, trojan horses, and worms (Webroot 2006). We believe the first step for turning this development in favour for both users and anti-spyware vendors, is to create a standard classification of spyware. Once such a classification exists anti-spyware vendors can make a more clear separation between legitimate and illegitimate software, which result in more accurate countermeasures.

Predicted future development

There are several trends integrating computers and software into people’s daily lives. One example is traditional media-oriented products which are being integrated into a single device, called media centres. These media centres include the same functionality as conventional television, DVD-players, and stereo equipment, but combined with an Internet connected computer. In a foreseeable future these media centres are anticipated to reach vast consumer impact (CES) (Newman 2006). In this setting, spyware could monitor and surveillance for instance what television channels are being watched, when/why users change channel or what DVD movies users have purchased and watch. This is information that is highly attractive for any advertising or media-oriented corporation to obtain. This presents us with a probable scenario where spyware is tailored towards these new platforms; the technology needed is to a large extent the same as is used in spyware today.

Another interesting area for spyware vendors is the increasing amount of mobile devices being shipped. Distributors of advertisements have already turned their eyes to these devices. So far this development have not utilized the geographic position data stored in these devices. However, during the time of this writing companies are working on GPS-guided ads and coupons destined for mobile phones and hand-held devices (Business 2.0 Magazine). In other words, development of location-based marketing that allow advertising companies to get access to personal geographical data so that they can serve geographically dependant ads and coupons to their customers. Once such geographic data is being harvested and correlated with already accumulated personal information, another privacy barrier has been crossed.


This article is licensed under the GNU Free Documentation License. It uses material from the Wikipedia.