Big Data is Your Data – Your Helper/Spy in the Cloud

Introduction

Over the past few weeks, I’ve observed that whenever I search for things on the Internet, a whole lot of advertisements appear, across a range of sites, which focus on the thing I’ve searched for. This seems quite worrying, as your profile and search information is normally constrained within a certain site, with which you develop a trust relationship. So forget the NSA and the threats around national intelligence agencies spying on your network traces: the real snooping on our lives comes from those who are trying to understand how we live, and who have created a whole business model around mining data about the user. It’s companies such as LinkedIn, Amazon, Facebook and Google that try to analyse the things we like and what we buy, and it’s the payback that we must make in order to get their “free” services. “Online Behavioral Advertising”, such as AdChoices, is the most extreme version of using your profile and Web searches to push content to you on the Web sites that you connect to. The quote from the advertising material says:

Better ads and offers. With interest-based advertising, you get ads that are more interesting, 
relevant, and useful to you. Those relevant ads improve the online experience and help users 
find the things that interest them more easily.

But there’s a whole debate in here about the privacy of our profiles, and our ability to hide our tracks when we want. The statement is a dream for marketing departments, but a Big Brother scenario for others. If you were asked whether you wanted this targeting and agreed to it, then there’s no argument; if not, there’s a problem here.

If you want to find out who is watching you and customising adverts for you, go to: http://www.aboutads.info/choices/. A quick check shows that I have 88 companies aiming their targeting at me. The privacy statement hints at the ability to gather a whole range of information on the user:

- Personal information you knowingly choose to disclose to us such as your name, mailing
address, and email address. You may provide this information when you make requests for
information or assistance.
- Non-personal information including but not limited to browser type, IP address, operating
system, the date and time of a visit, the pages visited on this Site, the time spent viewing
the Site, and return visits to the Site.
- We may also collect aggregated information as you and others browse our Site.

The marketing person’s dream

Overall, companies are getting better at understanding how we behave, and the targeting is becoming more focused, as advertisers want to target their products exactly at the customers who want them. Never before has this opportunity existed, where key demographics can be targeted so specifically. In the past, an advertiser selling chocolate might decide to target a female audience, aged 18-35, with a family, and so would promote their products through the TV programs and print media that had a high percentage of that demographic. Unfortunately, much of the time they would miss their customers, as many fell outside that range, and even for those within it, there was no guarantee that they actually liked chocolate or would consider purchasing it. And so the Web companies have tried a few times to really understand us, and to target us.

When our society started to exchange goods for money, it became obvious that those who made products could go and sell them themselves. The problem came when they wanted to sell in other regions: they would need people to do this for them, or it would take them away from what they did best, which was actually making the product. So our commerce infrastructure provides us with sellers who take products and then sell them to the end customer. The seller then has the opportunity to promote certain goods for some favorable commission, and both the provider and the seller benefit. Often, too, a product can be promoted by a third party: a marketeer, whose function is to lead the customer to the seller. Again a commission is paid, every time there is evidence that the marketeer led the customer to the seller. Obviously the marketeer could just promote the seller, and lots of customers could come to shop without actually purchasing anything, so there could be a “finder’s” fee for the number of customers that they bring to a store, with a higher “finder and sale” fee paid where a customer actually goes ahead and purchases something.
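
To make the two fee levels concrete, here is a minimal sketch in Python. The names (`Referral`, `commission_pence`) and the fee amounts are illustrative assumptions of mine, not figures from any real affiliate scheme: each customer brought to the store earns the marketeer a small finder’s fee, and a customer who goes on to buy earns the higher “finder and sale” fee instead.

```python
from dataclasses import dataclass


@dataclass
class Referral:
    """One customer led to the seller by the marketeer (hypothetical record)."""
    customer_id: str
    purchased: bool  # True if the customer actually bought something


def commission_pence(referrals, finders_fee=10, finder_and_sale_fee=500):
    """Total commission in pence; both fee values are made-up defaults."""
    total = 0
    for referral in referrals:
        if referral.purchased:
            # The customer was found AND bought: the higher fee applies.
            total += finder_and_sale_fee
        else:
            # The customer only visited: the basic finder's fee applies.
            total += finders_fee
    return total
```

With two browsers and one buyer, the marketeer in this sketch would earn two finder’s fees plus one “finder and sale” fee, which shows why the marketeer is motivated to bring customers who are likely to buy.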

On the Internet, this has become a major industry. In the past we have seen major advertising campaigns on TV and print media, and we can often spot the intention of companies to target us, and get us into stores, and make purchases. On the Internet, the targeting is razor sharp, and the link between the product and the customer is now serviced by a whole range of stakeholders, including: Transaction verifiers, Brand Monitors, Web-traffic analytics, Affiliate Platforms and Campaign Verification (Figure 1).

Figure 1: Affiliate marketing

Pay per click or per purchase?

For users the mining of the data is generally fine, and Facebook, and many other Internet-focused companies, especially Google and Amazon, extensively mine our data and try to make sense of it. In this symbiotic relationship, the Internet companies give us something that we want, such as free email, or the opportunity to distribute our messages. What was strange about the Facebook experiment was that its users were being treated in the same way as rats in a laboratory, and had no idea that they were involved in an experiment. On the other hand, it is not that much different from the way that affiliate networks operate: they analyse the user, try to push content from an affiliate of the network, and then monitor the response from the user (Figure 1).

We increasingly see advertising in the Web pages we access, where the user is matched to their profile through a cookie, and where digital marketing agencies and affiliate marketing companies try to match an advertisement to our profile. They then monitor the success of the advertising using analytics such as:

  • Dwell time. This metric is used to gauge how interested the user was before clicking on the content.
  • Click-through. This records the click-through rate on content. An affiliate publisher will often be paid for click-throughs on advertising material. This can lead to click-through scams, where users are paid to click on advertising content on a page.
  • Purchases. This records the complete process of clicking through, and the user actually purchasing something. This is the best level of success, and can lead to higher levels of income, and in some cases to a share of the purchase price. Again this type of metric can lead to fraudulent activity, where a fraudster will use stolen credit card details to purchase a high-priced item through a fraudulent Web site, and use this to gain commission from an online purchase (which is only traced as fraudulent at some later time).
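
As a rough sketch of how these three metrics might be computed from a log of advert impressions, consider the Python below. The `Impression` record and the `summarise` function are hypothetical names for illustration, not part of any real analytics product, and real systems will define each metric in their own way.

```python
from dataclasses import dataclass


@dataclass
class Impression:
    """One showing of an advert to a user (hypothetical log record)."""
    dwell_seconds: float  # how long the advert was in front of the user
    clicked: bool         # did the user click through?
    purchased: bool       # did the click-through end in a purchase?


def summarise(impressions):
    """Return mean dwell time, click-through rate and conversion rate."""
    shown = len(impressions)
    clicks = sum(1 for i in impressions if i.clicked)
    # A purchase only counts if it followed a click-through.
    purchases = sum(1 for i in impressions if i.clicked and i.purchased)
    total_dwell = sum(i.dwell_seconds for i in impressions)
    return {
        "mean_dwell_seconds": total_dwell / shown if shown else 0.0,
        # Fraction of impressions that were clicked.
        "click_through_rate": clicks / shown if shown else 0.0,
        # Fraction of click-throughs that ended in a purchase.
        "conversion_rate": purchases / clicks if clicks else 0.0,
    }
```

Note that the conversion rate here is measured against clicks rather than impressions, which is one common choice; it is exactly the figure that the click-through and stolen-card scams described above try to inflate.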

The targeting has been a slow process of evolution. It began with advertisements placed within Web pages, often untargeted, which is similar to deciding where you would advertise in the print media, and then placing your advertisement there. Normally the marketing team would have a strategy, and then define which Web sites best match their demographic, and how likely visitors would be to purchase from that site, and place the advertisement on the Web site. We can see in Figure 2 that the advertisements often integrate seamlessly into Web sites, and can fit themselves into whatever area is defined for them. In this way a Web designer can leave a space on the page, knowing that it will be filled in later. The advertisers normally even allow some customisation to make sure the advert fits in with the general layout of the page.

Figure 2: Integrated advertising

This type of marketing is fairly untargeted, and the user accepts it, as they are not being tracked in any way for the advertisement. It is similar to the way that a newspaper carries advertisements, which the user does not need to read.

100% Target

The other type of approach, which doesn’t scare the user into thinking they are being tracked, is to place the paid links (per click or per purchase) within the search results (Figure 3). Many users have had their Web search page redirected, such as with Conduit and with a whole range of redirectors embedded into freeware software. It is thus an excellent opportunity to advertise products without the user knowing. Google has, at least, marked the advertising links. Unfortunately for Google (and the marketing companies), users tend not to click on the paid links. So another method had to be found, and that method is AdChoices.

Figure 3: Promoted links

We have all seen the benefit from Amazon, where we are recommended products based on what we have previously purchased, or have been looking at. It basically allows Amazon to provide a better service, and often jogs our memory. This, though, is done with the user’s consent, as they log into the site, and the site has gained the trust to track the user and build up a profile on them. What scares users is when they purchase something from one site, and find out that they are being targeted on another site with the thing that they bought. This is cross-pollination of user profiling. Obviously this can happen where one company sells on the profile data of their users, which is then mined for their interests, but there is now a much more targeted operation going on – AdSense – and it is one which cross-pollinates.

So have you noticed that you have looked for a new hard disk on a site, and then, a few days later, your news site shows an advertisement for a disk that exactly fits your interests? This is AdChoices working in the background, analysing your profile and feeding targeted adverts to you. The advertisers know that users often surf for ideas and don’t purchase straight away, so the adverts become jog points for your memory.

Figure 4 shows an example. Over the past few days I’ve had problems with Office 365 on my Mac desktop, and I’ve searched around the Web looking for fixes. Today, when I call up a page from an online newspaper, it integrates an Office 365 link. In fact, wherever I go, it seems to think that Office 365 is the most important thing on my mind. Previous to this I had been considering purchasing a Microsoft Surface Pro 3, and the adverts pushed to me were often focused on this product. Unfortunately I got sick of seeing these adverts, especially as the ones pushed to me had an animation which highlighted the product range, and which looked like some of the nasty adverts of the past (such as “You are the Millionth User of this Site”), and I went and bought an Android device instead.

Figure 4: AdChoices

If you look back at Figure 2 (for the Celtic FC site), you’ll find a Network Monitoring advertisement. This appears because I’ve been doing some research on visualising network traffic and log files, so Google has assumed that I’m in the market for network monitoring tools and has pushed a network monitoring package. Having spoken to many other people, they too are observing that searching to buy beds online will cause a whole lot of adverts for beds to appear on sites that have nothing to do with beds. Thus Web sites are becoming places that can push you products that they don’t even have on show, and do not sell.

So it looks like the Internet has the perfect tool for the marketeer, as only Google can really know what we are doing. It’s a powerful system, and Google guards against destroying user trust with a policy which states that the matching is based on:

  • The types of websites you visit and the mobile apps you have on your device.
  • The DoubleClick cookie on your browser and the settings in your Ads Settings
  • The websites and apps you’ve visited that belong to businesses advertising with Google
  • Your previous interactions with Google’s ads or advertising services
  • Your Google profile, including YouTube and Google+ activity

This basically says that whenever you are using Google, such as for searching, watching online content or doing some social media thing with them, they are watching you, and trying to understand your likes. They then use a special cookie to track you on their affiliate network, and do the magic of matching in the background. Notice too that mobile apps are used for gathering information, as these devices contain a whole range of information that defines our behaviour, including how often we search the Web through the day, and how this changes.
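
The cookie-based matching can be pictured with a toy model in Python. This is only a sketch under my own assumptions: the `AdNetwork` class and its methods are hypothetical, and it makes no claim to describe how Google or any real advertising network is implemented. The idea is simply that one network hands your browser a cookie, every site in the network reports what you looked at against that cookie, and any other site in the network can then ask which advert to show you.

```python
import uuid
from collections import Counter, defaultdict


class AdNetwork:
    """Toy advertising network shared by many unrelated Web sites."""

    def __init__(self):
        # cookie value -> count of topics the browser has shown interest in
        self.profiles = defaultdict(Counter)

    def issue_cookie(self):
        """Give a new browser a random identifier (no name attached)."""
        return uuid.uuid4().hex

    def record_interest(self, cookie, topic):
        """Called by a member site when the browser searches or browses there."""
        self.profiles[cookie][topic] += 1

    def pick_ad(self, cookie, default="generic advert"):
        """Called by any member site that has an advert slot to fill."""
        interests = self.profiles.get(cookie)
        if not interests:
            # Nothing is known about this browser, so fall back to untargeted.
            return default
        topic, _count = interests.most_common(1)[0]
        return f"advert for {topic}"
```

In this model, a shop site calling `record_interest(cookie, "hard disks")` is enough for a news site, which knows nothing about disks, to receive `"advert for hard disks"` from `pick_ad(cookie)` a few days later. The profile is keyed by the cookie rather than by your name, but it follows you from site to site all the same.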

A key statement is “Your previous interactions with Google’s ads or advertising services”, which basically says that they are monitoring our clicks and follow-throughs on adverts. So even if the advert is there, and you have an interest in the product, if you are not clicking on it, it’s a waste of space. So you need to watch which adverts you click on, as a click puts a big tick in the box of your likes. With Facebook you see the mining up front, and then get the opportunity to say that the advertising material is not quite what you want, but the future is towards an automated machine algorithm, and less user choice. The Web you’ll get will be the one that your Web company has planned for you, and the concept of jumping from one site to the next is going, as affiliates are creating networks where they can cross-pollinate your profile.

So it’s a very strange matching service that is going on, and it is one example of how companies aim to gather your tracks over the Internet, and understand how you live and what you like. In that way they can target just you. It’s a very fine line that these companies walk, as some companies who integrate into your browser are accused of spying on us, so Google had better watch out and tread carefully. Google has guarded against losing user trust by:

  • Not linking your name or personally identifiable information to your DoubleClick cookie without your consent.
  • Not associating your DoubleClick cookie with sensitive topics like race, religion, sexual orientation or health without your consent.

So they don’t give away your name, only that you are a target customer, and they don’t give away sensitive information about you, but obviously “sensitive” information could relate to your shopping habits too.

So, finally, let’s do some searching for “beds” to purchase, and then access our horoscope. The result is shown in Figure 5, which seems to throw back your browsing history for all to see. So you can see that, on the Internet, horoscopes can now even predict what you are likely to buy next. We really would require tunnel vision not to see the products being recommended alongside this horoscope.

Figure 5: Your horoscope is personalised with a special interest in your preference for buying beds

Conclusions

Like it or not, you are interesting to a whole range of companies, and there has never been such an opportunity for them to learn everything about us … when we get up in the morning, what we have for breakfast, how we get to work, what type of software we use, what brands we like, … and so on. In the past, companies used survey data to understand the audience, so that they could target certain advertising channels. Now Google knows whether you’re a sensitive soul – watching Great British Bake Off on YouTube – or like driving fast – with Top Gear repeats. So we are all being observed, and there’s a whole lot happening about you in the background, and it’s all patched up with a little cookie that is dropped on your machine. The EU tried to do something about this matching and the consent around it, but it didn’t stick, as it was often a waste of time for users to try to understand how the cookie was being used. So marketeers love those cookies … and not the chocolate ones!

The Internet is almost predicting our needs before we have even thought about them. A search that I did for an Audi A4 warning light resulted in adverts for new cars. I assume the car companies hope that we might be thinking of getting rid of our car at the first sign of a problem, or that they can plant the message in our heads that it’s time for a new car. Who knows, but one thing that is sure is that Big Data gives companies the opportunity to get a 100% hit rate (or at least it will, once they can properly understand our searches). So watch out when you go searching for pile ointment, as you may get a whole lot of adverts that you might not like in your online browsing.

There are questions that you need to ask yourself about the information that the Internet is gathering on you. It isn’t just IP addresses: it is now trying to understand you, and how you live. Did you know, for example, that your Android phone actually tracks your location as you move, and stores it in the Cloud, and that this also happens with your iPhone? Google and Apple thus know exactly where you work, how you get to work, when you go for lunch … and so on.

If the user knows about the targeting, and agrees with it, then this is fine. Unfortunately, few people seem to know that this behavioral analysis is going on. Along with this, the user must be given the chance to opt out. Unfortunately the current system is flawed, as it is not possible to opt out of all the advertising.