Friday, April 6, 2018

Google DoubleClick Mozilla essay (final)



There are many problems with web advertising in general, including annoying features like autoplay video ads and pop-ups, and problems like “click fraud” that matter to advertisers. But the ethical issues are the most important, including malware (like exploit kits) and tracking ads that affect privacy. This essay focuses on the ethical issues with some of the kinds of ads that Google produces and the history behind them, and on why Larry/Sergey didn’t consider them when buying DoubleClick, for example. It also discusses Mozilla and how it is involved (as in the Google/Mozilla search deal), including Brendan Eich, who created JavaScript and eventually left Mozilla to found Brave. The difficulty of solving these issues is discussed as well. Of course, advertising is not limited to the web, and advertising in general has many benefits and risks (like deceptive advertising), most of which will not be discussed here.

The history of Google and its advertising will be discussed first. Google was founded in 1998 by Larry Page and Sergey Brin while they were at Stanford, and took VC funding from Kleiner Perkins (KP) and other partners. Google was founded with the search engine (with the PageRank algorithm) as its first product, but later added products like Gmail. Eric Schmidt was brought in as CEO in 2001; he recently stepped down but is still on the board. Google IPOed in 2004, using dual-class stock for example.

The first kind of ads that Google did was AdWords, dating back to 2000. AdWords was based on search keywords, and the text ads were displayed at the top of the search results (labelled as ads) and were relatively simple. Typically the highest bidder was shown, and the advertiser paid Google when the user clicked on the ads. AdWords involved relatively little tracking at least initially and will not be mentioned much here. At this time Google was also taking a stand against popup ads.
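
The basic mechanism described above (keyword match, highest bidder shown, payment only on a click) can be sketched roughly as follows. This is only an illustration under simplifying assumptions: the names `Bid`, `pick_ad`, and `charge_on_click` are hypothetical, and a real ad auction weighs more than the raw bid.

```python
from dataclasses import dataclass


@dataclass
class Bid:
    advertiser: str
    keyword: str
    max_cpc: float  # hypothetical field: most the advertiser pays per click


def pick_ad(bids, query):
    """Return the highest bid whose keyword appears in the query, or None."""
    words = query.lower().split()
    matching = [b for b in bids if b.keyword.lower() in words]
    return max(matching, key=lambda b: b.max_cpc, default=None)


def charge_on_click(winner, clicked):
    """Pay-per-click: the advertiser owes money only if the user clicked."""
    return winner.max_cpc if clicked else 0.0


bids = [
    Bid("shoe-shop", "shoes", 0.40),
    Bid("boot-barn", "shoes", 0.55),
    Bid("hat-hut", "hats", 0.90),
]
winner = pick_ad(bids, "cheap running shoes")
```

Note that nothing in this sketch needs to know who the user is; only the query matters, which is why keyword ads of this kind needed little tracking.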

AdSense showed ads on webpages themselves, using JavaScript. It was introduced in 2003. At least initially, AdSense was based on keywords on the webpages themselves (which Google fetched from its cache, for example), which advertisers could bid on. As with AdWords, Google and the websites get paid when users click on the ads. It also involved little tracking at least initially, but the malware problems will be described here.

Google bought DoubleClick in 2008. DoubleClick was founded in 1995. It made more sophisticated ad tracking via cookies and the like (often called “retargeting”) famous, and the problems will be described here. DoubleClick itself called its product “Dynamic Advertising Reporting and Targeting” at one point, for example. Initially DoubleClick was mostly banner ads, and many users developed so-called banner blindness from these ads. Cookies themselves were invented at Netscape in 1994, and the IETF group that developed RFC 2109 and RFC 2965 already knew that tracking with “third-party cookies” was a problem (and it was mentioned in these RFCs). Those attempts at IETF cookie standards ultimately failed, partly because they were incompatible with the browsers of the time, and led to RFC 6265, which is closer to how cookies are implemented in browsers today. The concerns also led to W3C P3P, which was famously implemented in IE6. It was an attempt to get the tracking under control, but of course it also failed (partly because it was too complex) and was removed from Windows 10.
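
To illustrate why third-party cookies were seen as a tracking problem so early, here is a small simulation. It is a sketch under stated assumptions, not how any real ad server works: `AdServer`, `Browser`, and the `ads.example` domain are all hypothetical. The point is only that one cookie, set by one ad domain embedded on many unrelated sites, links those visits together.

```python
import uuid
from collections import defaultdict


class AdServer:
    """Hypothetical third-party ad server embedded on many unrelated sites."""

    def __init__(self):
        self.history = defaultdict(list)  # cookie id -> pages where the ad loaded

    def serve_ad(self, cookie_id, referring_page):
        # First request: no cookie yet, so issue a fresh random identifier.
        if cookie_id is None:
            cookie_id = uuid.uuid4().hex
        self.history[cookie_id].append(referring_page)
        return cookie_id  # would be sent back to the browser via Set-Cookie


class Browser:
    def __init__(self):
        self.cookie_jar = {}  # cookies are stored per domain that set them

    def visit(self, page, ad_server, ad_domain="ads.example"):
        # The page embeds an ad from ad_domain; the browser attaches that
        # domain's cookie no matter which site the user is actually reading.
        cookie = self.cookie_jar.get(ad_domain)
        self.cookie_jar[ad_domain] = ad_server.serve_ad(cookie, page)


ads = AdServer()
browser = Browser()
pages = ["news.example/politics", "shop.example/shoes", "health.example/symptoms"]
for page in pages:
    browser.visit(page, ads)

# Everything the ad server has recorded under this browser's single cookie.
profile = ads.history[browser.cookie_jar["ads.example"]]
```

After three visits to three different sites, `profile` holds all three pages under one identifier, even though none of the sites shared anything with each other directly.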

Google bought Urchin in 2005, turning it into Google Analytics. Urchin was founded in 1998. Initially its product analyzed web server log files, with JavaScript tags being added in Urchin 4 (called “Urchin Traffic Monitor”). The hosted version based entirely on JavaScript that was created later was initially called “Urchin on Demand” and was introduced in 2004. Of course, the original software that was sold received little attention once Google bought the company and the hosted version became Google Analytics, and it was discontinued in 2012.

One of the problems of ads is malware. Typically websites take the highest bidder of ads and fill as much space as possible with ads, making malware like exploit kits difficult to prevent. To make things worse, companies can only spend a limited amount of money on ads, so sites often have to take the highest bidder to make enough revenue to support them and often websites even use multiple ad networks. Flash was famous for some exploits for example (especially as it got more complex), and these days in general NPAPI and ActiveX etc plug-ins are dying off. Java was even worse for example with zero-days breaking the sandbox being very frequent at one point, and many browsers limit plug-ins to Flash only these days. Of course, there are browser exploits too.

Though the vast majority of exploits in kits are typically already patched, sometimes unpatched zero day exploits get delivered by ads like in the case of [1]. There is a market for exploit kits in general, and zero days are particularly valuable (obviously because they are unpatched). The FBI was also famous for using Firefox zero-days to deanonymize Tor users.

One of the most famous cases of ads containing malware was at Forbes, where the Angler exploit kit was served via pop-under ads after the site asked users to turn off ad blockers in 2016 (discovered by Brian Baskin). Of course, asking users to turn off ad blockers or otherwise fighting against them is not a good idea in the first place, and the incident illustrated some of the flaws discussed here. This exploit kit also hit sites like MSN, and it was one of the most popular exploit kits at one point.

Douglas Crockford tried to prevent malicious JavaScript in ads at Yahoo with AdSafe, including cross site scripting attacks. Of course, JavaScript is a Turing complete language making this more difficult, and Flash is even more complex. This is especially an issue when browser exploits are involved. I think AdSafe worked by creating a limited sandbox to prevent things like XSS attacks (using a special object and denying access to many other objects), and defining a subset of JavaScript that can be verified. Flash in ads was also common too though, and obviously this is harder to verify to ensure it is safe (Flash’s complexity is part of why it is gradually being phased out in browsers).

Another problem is tracking. The current economy is a debt-based economy based on consumption. The more money advertisers can extract from consumers, the more they are willing to spend on ads. This results in tracking getting creepier and creepier, and encourages consolidation of data, for example. Most ad tracking is called “retargeting” and is often based on cookies and JavaScript, and DoubleClick was one of the first to do it. All ads encourage consumption by definition, but tracking ads are particularly bad for these reasons.

For example, DoubleClick introduced cross-device retargeting in 2015. Of course, at least initially it was limited to tracking logged-in users via the user account (which any website can do), but it illustrated the trend. Google changed its privacy policy in 2016 to allow Google accounts to be used for such logged-in user tracking.

According to [2], “In December 2008 Google added DoubleClick cookies to AdSense ads”, tying the DoubleClick cookie-based tracking (dating from long before Google bought it) to AdSense. I assume that AdSense tracking probably did not exist before Google bought DoubleClick. Google Analytics added AdWords and AdSense support in 2009. In 2012, Google changed its privacy policy to allow data to be consolidated, which was also very controversial. In 2014, Google Analytics integrated with DoubleClick, allowing things like remarketing lists to be shared [13]. Remarketing lists are basically lists of website visitors that can be uniquely identified by things like cookies, and they are one of the ways of targeting ads to users. It can probably be assumed that sharing remarketing lists basically ties the tracking together. Sharing of Google Analytics remarketing lists with AdWords was introduced in 2015, along with linking of Google Analytics and AdWords “manager” accounts [3]. “Google Analytics 360” came in 2016, according to [5]. Remarketing lists for search ads were introduced in 2012 and were tied to Google Analytics in 2015 (though not all data from Google Analytics can be used). They allowed different search ads to be targeted to different visitors based on cookie-based tracking on websites (with sites using special tags for this purpose). For example, you can show different search ads to visitors who visit the site every day.

Of course, users often have little control over, and get little benefit from, the storage of user data and ad retargeting by trackers, especially when many parties are involved. This was mentioned during the Google/DoubleClick acquisition, for example. Of course, some systems provide more control than others, such as AdChoices. AdChoices was an attempt at self-regulation for ad publishers, and used an icon to indicate that data was being collected. You can click the icon to display the privacy policy for the ads or opt out of ad targeting. It was not the same as blocking ads completely though, and did not solve all of the problems of ads either. There was also an attempt at a Do-Not-Track HTTP header, which was probably too simple (and thus also very vague in its meaning), and obviously there was no guarantee that a site would comply since it was just an HTTP header (IE11 enabling it by default was also controversial, and Win10 no longer does so by default).
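
To show how little the Do-Not-Track header actually does, here is a minimal sketch of the server side. The header itself is real (`DNT: 1` expresses the preference), but the function name `tracking_allowed` is hypothetical and the whole check is voluntary: a site that simply never calls such a function is not stopped by anything.

```python
def tracking_allowed(headers):
    """Return False when the request carries 'DNT: 1', True otherwise."""
    # HTTP header names are case-insensitive, so normalise them first.
    normalised = {name.lower(): value.strip() for name, value in headers.items()}
    return normalised.get("dnt") != "1"
```

A cooperating site would call this before setting a tracking cookie, for example `if tracking_allowed(request_headers): ...`, while a non-cooperating site just ignores the header.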

Some of the problems with the opt-out methods are similar to the problems of a national “do not email” registry proposed in the US CAN-SPAM Act of 2003 for spam messages, and such lists to “opt out” of spam are widely considered to be unacceptable in general. Even “opt-out” or “unsubscribe” links in spam are widely considered untrustworthy for obvious reasons, though legitimate mailing lists also have them. The registry idea came from the similar “do not call” registry for telephone marketing (to stop marketing phone calls, which were of course considered more annoying than spam), but email and internet advertising ended up being very different from telephone calls, making these laws difficult to enforce. It is far easier to send an email than to call someone, for example, and email is also more difficult to trace to its origin, especially given that the Internet is global. The FTC has a report at [4] describing these problems (it was a report to Congress that was required by CAN-SPAM), including the possibility that such a list could be abused by spammers. “Closed-loop opt-in” using confirmation emails for mailing lists, on the other hand, is widely accepted, but it is not mentioned in CAN-SPAM. A related example is that things like AdChoices record the “opt-out” itself using cookies, which can obviously be used for other purposes as well.
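
The “closed-loop opt-in” idea mentioned above can be sketched in a few lines. This is an illustration only: the `MailingList` class and its method names are hypothetical, and actually sending the confirmation email is left out. The key property is that an address is added only after someone proves they can read mail sent to it.

```python
import secrets


class MailingList:
    def __init__(self):
        self.pending = {}         # token -> address awaiting confirmation
        self.subscribers = set()  # addresses that completed the loop

    def request_subscription(self, address):
        """Record a request and return the token that would be emailed."""
        token = secrets.token_urlsafe(16)
        self.pending[token] = address
        return token

    def confirm(self, token):
        """Subscribe the address only if the emailed token comes back."""
        address = self.pending.pop(token, None)
        if address is None:
            return False
        self.subscribers.add(address)
        return True
```

Compared with an opt-out list, nothing here can be harvested by a spammer: an unconfirmed address never becomes a subscriber, and the list contains only people who asked to be on it.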

There are some reasons why these problems were not apparent (for example to Larry/Sergey) when Google bought DoubleClick, or when remarketing lists were shared, or for that matter when Urchin became Google Analytics and the data was merged with ad data.

The difficulty of researching things like the tying of remarketing lists during the writing of this essay shows some of the problems. It seems that no one cared about the privacy implications when remarketing lists in AdSense and DoubleClick were shared, for example. In many cases, advertisers managed “remarketing” lists of “anonymous” visitors who were being tracked by cookies from a central console without thinking of the privacy problems, treating visitors almost as numbers. This ties in with the idea of treating people as “consumers” to be extracted from, which is also fundamentally flawed. Another example of this is AOL, which famously made it difficult to cancel at one point, partly because measuring “customer loyalty” as numbers to be extracted from consumers was part of its culture. To make it worse, AOL once charged consumers by the time spent online, so the longer they stayed, the more revenue it made.

On the problem of malware, it probably can be assumed that no one cared as much about security when AdSense added Flash ads, for example, since exploits were not as common as they are now. One of the first common kinds of exploit (dating back to the Morris worm in the late 1980s) was the stack-based and sometimes heap-based buffer overrun (using null-terminated C string copies that don’t limit the length copied, for example). Exploits then got more sophisticated and complex (use after free, return-oriented programming, and ASLR information leaks are some examples), especially as mitigation measures like stack canaries, NX, and ASLR became common in response. It probably can be assumed that the market for exploit kits, zero-day exploits, and the like also took time to develop (though some of them were made famous by recent NSA leaks, for example).

The Google-DoubleClick acquisition was also controversial, with EPIC, CDD, and US PIRG for example filing a complaint with the FTC in April 2007, a “first supplement” to the complaint in June 2007, and a “second supplement” in September 2007. There was also a Senate hearing on Sept 27, 2007 with testimony from a variety of sources regarding that issue. One of the concerns back then was aggregation of tracking data and lack of control by users, though other issues unrelated to ads, like storage of IP addresses by search engines, were also mentioned. Ultimately it took the FTC until the end of 2007 to approve the deal, after a “second request”.

Before the Google-DoubleClick acquisition, DoubleClick had its own controversy involving Abacus Direct, which it acquired in 1999. Abacus Direct seems to have been a market research company focused on consumer buying behavior. As a result, Abacus had a lot of personal info about consumers, and there were concerns that this data could be merged with DoubleClick data and used to deanonymize users. After privacy complaints and an FTC investigation, the planned merging of the data never happened.

In 2012, Jonathan Mayer discovered that Google used some tricks in JavaScript to allow tracking in Safari. Google was able to bypass the cookie blocking policy in Safari by using an invisible form to fool Safari into allowing cookies. The FTC fined Google $22.5 million over this behaviour, and more recently there have been lawsuits about it in the UK. There has also been a class action lawsuit about this in the US. Google argued at the time that the tracking was unintentional and that it was related to Google+ “Plus” buttons on DoubleClick ads (for logged-in users, I believe). It is probably worth mentioning here that a lot of these kinds of buttons (like Facebook’s Like buttons, to name another example) do their own tracking too (they generally work by using IFRAMEs pointing to the website involved), and this has been well known for years. For example, according to [6] Facebook started using the tracking Like buttons to target ads in 2015. I think the Facebook-WhatsApp acquisition story is also famous by now BTW, including how they eventually allowed data sharing between the two (presumably after years of losses). It is worth mentioning that even a WhatsApp co-founder now recommends deleting Facebook (especially after the Cambridge Analytica debacle).

Now, let’s discuss Mozilla. Brendan Eich was the creator of JavaScript at Netscape when it was invented in 1995 and was the CTO of Mozilla Corporation from 2005 to 2014. After he stepped down from Mozilla in 2014 (just after he became CEO and after bad publicity stemming from his political donations about things like gay marriage), he was one of the founders of Brave with its Basic Attention Token etc. Andreas Gal joined Mozilla in 2008 and was the CTO from 2014 until 2015 when he left Mozilla.

Mozilla signed the Google search deal in 2004, before Google even IPOed (let alone things like DoubleClick). Mozilla switched to a Yahoo search deal in late 2014 (by then the search engine was based on MS’s Bing I think), which was part of Marissa Mayer’s attempt to fix Yahoo before it was sold to Verizon. Recently Mozilla switched back to Google as the default search engine.

BrendanEich mentioned in [7] that “It's not a simple Newtonian-physics (or fake economics based on same) problem.” This was about the history of the Google search deal with Mozilla and the fact that it was signed before Google IPOed (when it was being funded by VCs). It is worth mentioning here that Google was founded in 1998 when the now famous dot-com bubble was at the peak and VC funding was common (allowing many startups to grow fast which was considered more important than profits). Many other dot-com startups at the time had problems and ended up failing when the bubble collapsed around 2001. It is worth mentioning that the DoubleClick acquisition dates back to 2007 which was just before the housing bubble famously collapsed leading to another recession, and that bubble probably started just after the dot-com bubble.

BrendanEich mentioned in [8] that “A friend said in 2003 that Sergey declared G would not acquire display ads & arb. Search vs. Display as that would be “evil”.”, before Google even IPOed (in 2004). Unfortunately no other source was given.

It was mentioned on Twitter that Firefox OS enabled tracking protection by default unlike desktop Firefox. It was mentioned in [9] that “Yup. I was able to sneak that past management”. I then asked “I wonder if you ever talked to Larry/Sergey.” and Brendan then answered that Andreas didn’t of course. I wonder what would have happened if they did.

[10] has some information on the effect of the EU GDPR on Google ads. Notice that AdWords complies if all “personalization” features are removed, for example. This includes things like “remarketing”. I suspect that AdWords did not have these features when it was first created in 2000. Other features like “remarketing lists for search ads” are also listed as not compliant, and of course were probably added later too. There was also the infamous cookie law that required notification for placing cookies, which was not that effective but was a major step in that direction given that most ad tracking (including DoubleClick) was based on cookies.

Data breaches are also a problem. The AOL search data breach from 2006 is pretty famous. The data was “anonymized” but the search terms were often enough to deanonymize users. Ad tracking data is likely similar, including browsing history and the like. Anonymizing data is a useful technique to avoid accidental abuse, but some kinds of data are hard to anonymize in a way that prevents all abuse. For example, various techniques for anonymizing IP addresses and MAC addresses have been developed, including hashing and truncation. Of course, the more data that is consolidated and collected, the higher the risk and impact of a breach.
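
The two techniques named above, truncation and hashing, can be sketched for IPv4 addresses as follows. This is a minimal illustration under my own assumptions (the function names, the 24-bit cut, and the salted SHA-256 are choices made for the example, not a description of what any particular company does). The last function shows why hashing alone is weak: the address space is small enough to search if the salt is known.

```python
import hashlib
import ipaddress


def truncate_ipv4(addr, keep_bits=24):
    """Zero the low bits of an IPv4 address, keeping only the network part."""
    network = ipaddress.ip_network(f"{addr}/{keep_bits}", strict=False)
    return str(network.network_address)


def hash_ip(addr, salt):
    """Replace an address with a salted SHA-256 digest (hex string)."""
    packed = ipaddress.ip_address(addr).packed
    return hashlib.sha256(salt + packed).hexdigest()


def reverse_hash_in_subnet(digest, salt, subnet):
    """Brute-force a hashed address by trying every address in a subnet."""
    for candidate in ipaddress.ip_network(subnet):
        if hash_ip(str(candidate), salt) == digest:
            return str(candidate)
    return None
```

Truncation throws information away for good, so it cannot be reversed, but many users then share one value. Hashing keeps one value per address, which still allows the same visitor to be linked across records, and anyone holding the salt can recover the address by enumeration as shown.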

Of course, it is worth noting that Google/DoubleClick isn’t the only one involved in the ad bubble (though DoubleClick was one of the first to do ad tracking, I think). I think Taboola is often considered even worse than Google, for example. The same fundamental problems with tracking, malware ads, the ad bubble, etc. however tend to apply to all of the ad networks. Some of the worse ones may use browser fingerprinting via things like JavaScript, which is even worse than the tracking via cookies that is most commonly used. Browser fingerprinting is generally difficult to prevent on the browser side, but it is so well known that the WHATWG HTML spec mentions it and marks the parts of the spec where there is a risk. For example, the list of browser plugins (navigator.plugins in JavaScript) could be used at one point (it used not to be sorted, so its order could differ between users, which made the fingerprinting even easier), but fortunately plug-ins are dying off anyway. EFF created Panopticlick, which illustrated some of the fingerprinting that was possible, and other examples that became famous included Evercookie by Samy Kamkar. To make things worse, many plugins like Flash had their own cookies as well (though browsers have been getting better at clearing them). It is also worth noting that the current tracking ads are not the only kind of web advertising. There are so-called “first-party” and “third-party” ads and cookies. Examples of first-party ads include Twitter and Reddit ads. Examples of third-party ads include DoubleClick and Taboola ads. First-party ads don’t have the issues described here, but can still be annoying.
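
The plugin-list example above can be made concrete with a small sketch. Everything here is hypothetical (the attribute names, the sample values, and the `fingerprint` and `normalised` helpers); a real fingerprinting script would run in the browser and collect far more. The sketch only shows that hashing a handful of browser-visible values gives an identifier without storing any cookie, and that an unsorted plugin list adds extra distinguishing information.

```python
import hashlib
import json


def fingerprint(attributes):
    """Hash browser-visible attributes into a short, stable identifier."""
    # Dictionary keys are sorted so only the values matter; list values such
    # as the plugin list deliberately keep whatever order they arrived in.
    canonical = json.dumps(attributes, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


def normalised(attributes):
    """Mimic a browser that reports its plugin list in sorted order."""
    return dict(attributes, plugins=sorted(attributes["plugins"]))


alice = {
    "user_agent": "ExampleBrowser/1.0",
    "screen": [1920, 1080],
    "timezone_offset": -60,
    "plugins": ["Flash", "Java", "PDF Viewer"],
}
# Same browser, same plugins, but installed (and so listed) in another order.
bob = dict(alice, plugins=["PDF Viewer", "Flash", "Java"])
```

With the raw lists, `fingerprint(alice)` and `fingerprint(bob)` differ even though the two setups are otherwise identical; after `normalised`, they collide, which is the effect sorting the list has on this one signal.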

Recently, Google’s ad blocking and “better ads” effort (including the so-called Coalition for Better Ads) target annoying ads, but don’t fix the fundamental issues described here. Apple’s ad blocking targets retargeting by limiting the life of cookies, for example (making them less effective for tracking), but does not change the display of ads or make ads less annoying (for example, autoplay video ads are pretty famous as well, especially with Flash).

Now, fixing the problems might be difficult. Obviously it would affect not only shareholders but pretty much everyone else if Google completely got rid of tracking ads. This includes sites depending on Google ads for revenue as well as Google itself. One example of how hard it is to give up a revenue model is that both Microsoft and Novell used Client Access Licenses (CALs). CALs (called node licenses by Novell, I think) are per-user or per-computer licenses common in server software like NetWare and Windows Server. Of course, when Novell moved to Linux (by buying SUSE), it was open source software that didn’t have CALs (as with Red Hat, customers only paid for support), meaning that Novell could not expect the same level of revenue as in the NetWare days. The story about Sun’s open source projects and Jonathan Schwartz (the former “ponytail” CEO), and how Sun eventually had to sell to Oracle, is probably pretty famous as well (some examples of open source projects from that period included OpenSolaris, OpenOffice, and OpenJDK). The ad bubble will probably not last forever though. Bubbles like this one are part of the problem of the current debt-based economy (the main problem is that it allows almost infinite amounts of “debt” in US dollars since we got off the gold standard in 1971, most commonly government debt), especially since it encourages extracting as much money as possible from so-called “consumers” (another example is Adobe Creative Cloud subscriptions and how Adobe’s stock price rose after they were introduced).

Google was famous for offering large amounts of storage in Gmail since the launch in 2004 (in comparison to other webmail services, which offered relatively little storage), not to mention that the size of the search index also probably grows over time. According to [11] as of mid-2016, “Google indexes 60 trillion web pages according to “How Search Works.” It takes over 100 petabytes (equivalent to 100,000 1TB hard drives) to store it all. For comparison Google’s web index was 1 trillion pages in 2008 and in 2000 it was a meager 1 billion.” This is obviously faster than hard drive capacity has been increasing. YouTube also consumes a lot of storage space because of all the videos, obviously. YouTube started in 2005 and was bought by Google in late 2006, and it made Flash video using H.264 famous on the web (though it is now being replaced by HTML video and the royalty-free VP8/VP9/AV1 as Flash becomes obsolete). (YouTube also has its own advertising BTW, often ads that play before the video starts.) This obviously means the amount of revenue Google makes always has to grow (since storage costs always increase), or eventually profit margins would decline. While in retrospect search ads probably could never have grown forever in the first place, this is particularly hard during recessions like the one in 2007-2008. According to [12], Internet advertising declined the least in Q1 2009 but still declined. This is still an issue with cloud providers offering “unlimited” storage to users that gets abused to store excessive data. A recent example is Amazon, where some users were touting being able to store more than 1PB, leading Amazon to eventually end unlimited storage. Another example is that Backblaze offered “unlimited” storage for data backup but deleted the data after a specific time of no use.
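
The index figures quoted from [11] can be turned into yearly growth rates with a little arithmetic. This is only a rough calculation: it assumes the 60 trillion figure applies to 2016 and that growth was smooth between the quoted years, neither of which the source states. The helper name `annual_growth` is my own.

```python
def annual_growth(start, end, years):
    """Compound annual growth factor between two measurements."""
    return (end / start) ** (1.0 / years)


# Figures quoted from [11]: 1 billion pages (2000), 1 trillion (2008),
# 60 trillion (taken here as 2016).
growth_2000_2008 = annual_growth(1e9, 1e12, 8)    # about 2.4x per year
growth_2008_2016 = annual_growth(1e12, 60e12, 8)  # about 1.7x per year

# 100 petabytes expressed in 1 TB drives, using decimal units (1 PB = 1000 TB).
drives_needed = 100 * 1000
```

Even the slower of the two periods works out to the index growing by roughly two thirds every year, which is the kind of growth the paragraph above is comparing against hard drive capacity.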

For Mozilla, a good example to illustrate the problems with funding browser development is the Opera browser. The company was founded in 1995 in Norway. The first browser was released in 1996. It IPOed in 2004. The browser used its own engine and had a lot of unique features, like relatively good CSS support early on (unlike Netscape 4 at the time, which famously had relatively poor support and was a problem for web developers for years). At first it was officially a paid browser with a trial version (like Netscape was before 1998), but later they used ads (choices included banner ads or text-based Google ads) for non-paying customers. They eventually signed a search deal with Google, which removed the ads and instead just used Google as the default search engine (like Mozilla’s deal). Of course, there wasn’t much profit margin in a web browser, and so they had to cut costs to keep the stock price and quarterly earnings going up (so planning for the future was difficult, for example). It was strong in the mobile world before WebKit became dominant there though (before things like iPhone and Android and when things like WML were common) and may still be strong in some embedded applications, with products like Opera Mini, which was basically remote rendering of web pages (useful when devices had less processing power). Opera never had much market share (though it had plenty of fans back in the day), and in the end Opera had to switch to Chromium (with the Blink engine) instead of its own engine and codebase in the desktop browser (though they did release last updates for the old one that included TLS enhancements, for example). Opera’s browser business was eventually sold to a Chinese consortium, and the remaining company was later renamed Otello. One of the founders eventually started the Vivaldi browser, which is also based on Chromium/Blink but has many differences.

In contrast, the Mozilla Foundation was created as a non-profit organization as the old Netscape was dying off with AOL’s help (AOL bought Netscape in 1998 BTW). It owns the for-profit Mozilla Corporation for tax reasons (non-profits are not subject to the taxes that for-profits have in the US). I think the corporation owns the search deals like Yahoo and Google, for example. You can still donate to the Mozilla Foundation today. Mozilla Firefox 1.0 was released in 2004 after the Foundation was created (and after the branded Netscape 6/7 releases) and quickly took market share from the dominant IE6, which was stagnating the web (by being virtually unchanged for a long time without any real development) and was also well known for security problems like the Download.Ject attacks. MS was forced to respond with IE6 in Windows XP SP2, which in addition to security enhancements also added a few features like pop-up blocking, and with IE7, which finally brought real enhancements to the core engine that helped web developers (especially in places like CSS). The old Netscape search deal with Google dates back to 1999 (obviously Netscape.com was Netscape’s home page at the time), and the success of that deal probably inspired the later Google search deal that Mozilla did.

One alternative to the current tracking ads is called Basic Attention Token. Basic Attention Token is based on the Ethereum cryptocurrency and blockchain (this is like Bitcoin, but it is GPU minable using a different algorithm, for example, and it is one of the most popular GPU-minable coins). It was created by the makers of the Brave browser, which supports it directly. It is intended to “directly measure” attention. “Attention” is measured on the client side (based on local browser history) and tokens are awarded based on it (using so-called “basic attention metrics”), eliminating the privacy issues. This is often called a “zero-knowledge proof”. There are also other benefits, like reducing the so-called “click fraud” that hurts advertisers and is a common problem with current ads, and removing the need for intermediaries that do tracking like DoubleClick and Taboola (so advertisers also get more of the money, since they don’t have to pay the intermediaries). Many other kinds of tokens and “smart contracts” have been created on Ethereum, and so-called initial coin offerings (ICOs) have been the most common use of Ethereum (helping the price to rise). Of course, there is little to no regulation for them at the moment, which results in many scam ICOs too (they tend to raise money very quickly, partly since it is so easy to give coins to them).

There are also systems for paying authors directly like Patreon, though it is also trivial to use PayPal or cryptocurrencies for this purpose (though it is also harder to donate that way). Patreon allows money to be “pledged” to specific authors. There are also many kinds of “paywalls” implemented on websites, many of which have their own problems, like relying on cookies to track how many times people have visited a site (to limit the number of visits before the user has to pay, of course) or making it difficult to post links on Slashdot, Reddit, and Hacker News, whose users often dislike paywalls for obvious reasons (though some paywalls are better than others).
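
The cookie-counting paywall described above can be sketched with Python’s standard `http.cookies` module. This is a toy illustration, not any real site’s implementation: the `views` cookie name, the limit of three free articles, and the `handle_request` function are all assumptions made for the example, and a tampered (non-numeric) cookie is not handled.

```python
from http.cookies import SimpleCookie

FREE_ARTICLES = 3  # hypothetical allowance before the paywall appears


def handle_request(cookie_header):
    """Return (allowed, cookie_to_set) for one article view."""
    jar = SimpleCookie()
    jar.load(cookie_header or "")
    # The visit count lives entirely in a cookie held by the browser.
    seen = int(jar["views"].value) if "views" in jar else 0
    seen += 1
    out = SimpleCookie()
    out["views"] = str(seen)
    return seen <= FREE_ARTICLES, out["views"].OutputString()
```

Because the counter is stored on the client, a reader who clears cookies (or blocks them) starts again at zero, which is one reason this kind of metering is both easy to bypass and dependent on the same cookie machinery used for tracking.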

Of course, the problems described in this essay as well as other problems of ads (including the annoyance and performance cost of ads) led to more use of ad blockers, which also have their own history. Banner ad blindness has also been known for years now, and Google’s ads tended to be simple text-based ads, at least initially. One of the first types of blocking was popup blockers, and Google was taking a stand against popups in the early days (they were well known to be annoying). Popup blockers became common in browsers by the mid-2000s (even IE6 in XP SP2 had one). At one point circa 2002, AOL/Netscape was disabling the popup blocker in Netscape-branded Mozilla releases (at the time there were the Mozilla source code/binaries and the official Netscape-branded builds based on the Mozilla source). Of course, after user backlash they backed off from doing so. This was long before Google bought DoubleClick, for example. Later, more sophisticated ad and cookie blockers like AdBlock Plus and uBlock Origin came out as add-ons to browsers like Firefox, and one is built into Brave of course (along with BAT as a replacement for the lost ad revenue). Many other browsers also have similar tracking protection, including Firefox and IE, but it is disabled by default and may require that ad blocking lists (such as EasyList) be manually loaded. Of course, some sites have been attempting to detect ad blockers and ask users to turn them off (even Ars Technica did it at one point, though it only lasted one day), which is also ineffective and not a good idea for obvious reasons (including the fact that it reflects badly on the sites that are doing it). Lawsuits against ad blockers were also tried in some countries, which were obviously mostly unsuccessful (like a lawsuit against AdBlock Plus in Germany by publishers there).

There are many problems with the current tracking ads, but the worst ethical issues are malware and tracking. These problems led to people using ad blockers, for example. There are alternatives like Basic Attention Token, but even with things like that, fixing the problems might be difficult. For example, the effects would probably be serious (for everyone, not just shareholders) if Google got rid of tracking ads. Part of the reason tracking ads became so popular and would be difficult to remove completely from the web is the way the debt-based economy encourages extracting as much money as possible from consumers. There are reasons why Larry/Sergey didn’t realize the problems when they bought DoubleClick, for example, including that no one cared as much about security back then.

[1]"Zero-Day Exploit - Security News - Trend Micro USA", Trendmicro.com, 2018. [Online]. Available: https://www.trendmicro.com/vinfo/us/security/news/zero-day-exploit. [Accessed: 06- Apr- 2018].
[2]A. Klaassen, "Google Turns to Behavioral Targeting to Beef up Display Biz", Adage.com, 2009. [Online]. Available: http://adage.com/article/digital/google-turns-behavioral-targeting-beef-display-ads/135152/. [Accessed: 06- Apr- 2018].
[3]"Share Google Analytics data and remarketing lists more efficiently using manager accounts (MCC)", Inside AdWords, 2018. [Online]. Available: https://adwords.googleblog.com/2015/11/share-google-analytics-data-and.html. [Accessed: 06- Apr- 2018].
[4]"The CAN-SPAM Act of 2003: National Do Not Email Registy: A Federal Trade Commission Report to Congress", Federal Trade Commission, 2018. [Online]. Available: https://www.ftc.gov/reports/can-spam-act-2003-national-do-not-email-registy-federal-trade-commission-report-congress. [Accessed: 06- Apr- 2018].
[5]"Introducing the Google Analytics 360 Suite", Analytics Blog, 2018. [Online]. Available: https://analytics.googleblog.com/2016/03/introducing-google-analytics-360-suite.html. [Accessed: 06- Apr- 2018].
[6]T. Simonite, "Facebook Will Now Target Ads Based on What Its Like Buttons Saw You Do", MIT Technology Review, 2018. [Online]. Available: https://www.technologyreview.com/s/541351/facebooks-like-buttons-will-soon-track-your-web-browsing-to-target-ads/. [Accessed: 06- Apr- 2018].
[7]"BrendanEich on Twitter", Twitter, 2018. [Online]. Available: https://twitter.com/BrendanEich/status/932747825833680897. [Accessed: 06- Apr- 2018].
[8]"BrendanEich on Twitter", Twitter, 2018. [Online]. Available: https://twitter.com/BrendanEich/status/932473969625595904. [Accessed: 06- Apr- 2018].
[9]"andreasgal on Twitter", Twitter, 2018. [Online]. Available: https://twitter.com/andreasgal/status/932757853504339968. [Accessed: 06- Apr- 2018].
[10]D. Ryan, "How the GDPR will disrupt Google and Facebook", PageFair, 2017. [Online]. Available: https://pagefair.com/blog/2017/gdpr_risk_to_the_duopoly/. [Accessed: 06- Apr- 2018].
[11]Y. Paul, "How large is the Google search index, as of mid-2016?", Quora, 2018. [Online]. Available: https://www.quora.com/How-large-is-the-Google-search-index-as-of-mid-2016. [Accessed: 06- Apr- 2018].
[12]"Nothing to shout about", The Economist, 2009. [Online]. Available: https://www.economist.com/node/14140373. [Accessed: 06- Apr- 2018].
[13]"Google Analytics Summit 2014: What’s Next And On The Horizon For Analytics", Analytics Blog, 2014. [Online]. Available: https://analytics.googleblog.com/2014/05/google-analytics-summit-2014-whats-next.html. [Accessed: 06- Apr- 2018].


Thursday, March 29, 2018

Google DoubleClick Mozilla essay third draft


There are many problems with web advertising in general, including ads that are annoying to users (like autoplay video ads and pop-ups) and problems like click fraud, which matter to advertisers. But the ethical issues are the most important, including malware like exploit kits and tracking ads. I will be focusing on the ethical issues with some of the kinds of ads that Google produces, and why Larry/Sergey didn’t consider them when buying DoubleClick, for example. This essay will also talk about Mozilla and how it is involved (like in the Google/Mozilla search deal), including Brendan Eich, who created JavaScript and eventually left Mozilla to found Brave. I will also be talking about the difficulty of solving these issues.

Google was founded in 1998 by Larry Page and Sergey Brin while at Stanford, and took VC funding from KP and other partners. Eric Schmidt was brought in as CEO in 2001; he recently stepped down as executive chairman but is still on the board. Google IPOed in 2004, using dual-class stock, for example.

The first kind of ads that Google did was AdWords, dating back to 2000. AdWords was based on search keywords, and the text ads were displayed at the top of the search results (labelled as ads) and were relatively simple. Typically the highest bidder was shown, and the advertiser paid Google when the user clicked on the ad. AdWords involved relatively little tracking, at least initially, and will not be mentioned much here. At this time Google was also taking a stand against popup ads.
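The keyword bidding and pay-per-click model described above can be sketched roughly as follows. This is only a toy model of "show the highest bidder, charge on click"; the class, function names, advertisers and prices are all hypothetical, and it ignores whatever else the real system took into account.

```python
from dataclasses import dataclass

@dataclass
class Bid:
    advertiser: str
    keyword: str
    cents_per_click: int

def pick_ad(bids, query):
    """Return the highest bid whose keyword appears in the search query, or None."""
    words = set(query.lower().split())
    matching = [b for b in bids if b.keyword in words]
    return max(matching, key=lambda b: b.cents_per_click, default=None)

def charge_on_click(ad, owed):
    """The advertiser owes money only when the user actually clicks the ad."""
    owed[ad.advertiser] = owed.get(ad.advertiser, 0) + ad.cents_per_click
    return owed

# Hypothetical bids for illustration.
BIDS = [
    Bid("Airline A", "flights", 40),
    Bid("Airline B", "flights", 55),
    Bid("Hotel C", "hotels", 90),
]
```

Here a search for "cheap flights paris" would show Airline B's ad, and nothing is owed until a click is recorded with `charge_on_click`.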

AdSense showed ads on webpages themselves, using JavaScript. It was introduced in 2003. AdSense, at least initially, was based on keywords on the webpages themselves (which Google fetched from its cache, for example), which advertisers could bid on. As with AdWords, Google and the websites get paid when users click on the ads. It also involved little tracking, at least initially, but the malware problems will be described here.

Google bought DoubleClick in 2008. DoubleClick was founded in 1995. It made more sophisticated ad tracking via cookies and the like famous (often called “retargeting”), and the problems will be described here. DoubleClick itself called its product “Dynamic Advertising Reporting and Targeting” at one point, for example. Initially DoubleClick was mostly banner ads, and many users developed so-called banner blindness from these ads.

Google bought Urchin in 2005, turning it into Google Analytics. Initially Urchin’s product analyzed web server log files, with JavaScript tags being added later.

One of the problems with ads is malware. Typically websites take the highest bidder for ad space and fill as much space as possible with ads, making malware like exploit kits difficult to prevent. To make things worse, companies can only spend a limited amount of money on ads, so sites often have to take the highest bidder to make enough revenue to support themselves, and websites often even use multiple ad networks. Flash was famous for many exploits, for example, and these days plug-ins in general are dying off (Java was even worse). Of course, there are browser exploits too, like in Firefox and Chrome.

Though the vast majority of exploits in kits are typically already patched, sometimes unpatched zero day exploits get delivered by ads like in the case of https://www.trendmicro.com/vinfo/us/security/news/zero-day-exploit. There is a market for exploit kits in general, and zero days are particularly valuable (obviously because they are unpatched).

One of the most famous cases of ads containing malware was at Forbes, where the Angler exploit kit was served via pop-under ads after the site asked users to turn off ad blockers in 2016 (discovered by Brian Baskin). Of course, asking users to turn off ad blockers or otherwise fighting against them is not a good idea in the first place, and the incident illustrated some of the flaws discussed here. This exploit kit also hit sites like MSN.

Douglas Crockford tried to prevent malicious JavaScript in ads at Yahoo with AdSafe, including cross site scripting attacks. Of course, JavaScript is a Turing complete language making this more difficult, and Flash is even more complex. This is especially an issue when browser exploits are involved. I think AdSafe worked by creating a limited sandbox to prevent things like XSS attacks.

Another problem is tracking. The current economy is a debt-based economy based on consumption. The more money advertisers can extract from consumers, the more they are willing to spend on ads. This results in tracking getting creepier and creepier. Most of the tracking is called “retargeting” and it is often based on cookies and JavaScript.

For example, DoubleClick introduced cross-device retargeting in 2015. Of course, at least initially it was limited to tracking logged-in users via the user account, which any website can do, but it illustrates the trend. Google changed its privacy policy in 2016 to allow Google accounts to be used for such logged-in user tracking.

According to http://adage.com/article/digital/google-turns-behavioral-targeting-beef-display-ads/135152/, “In December 2008 Google added DoubleClick cookies to AdSense ads”, tying the DoubleClick cookie-based tracking (dating from long before Google bought it) to AdSense. I assume that AdSense tracking probably did not exist before Google bought DoubleClick. Google Analytics added AdWords and AdSense support in 2009. In 2012, Google changed its privacy policy to allow data to be consolidated, which was also very controversial. In 2014, Google Analytics integrated with DoubleClick, allowing things like remarketing lists to be shared. Remarketing lists for search ads (tied to Google Analytics) were introduced in 2015. Remarketing lists are basically lists of website visitors that can be uniquely identified by things like cookies, and they are one of the ways of targeting ads to users. I would assume that sharing remarketing lists basically ties the tracking together.
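To make the idea of a remarketing list a bit more concrete, here is a minimal sketch in Python of a list of visitors identified by cookie, and of what sharing such lists between products might amount to. This is only my assumption of the general shape, not how Google actually implements it; the class, function, list names and cookie values are all hypothetical.

```python
class RemarketingList:
    """A named set of visitor cookie IDs collected by one site or product."""

    def __init__(self, name):
        self.name = name
        self.cookie_ids = set()

    def record_visit(self, cookie_id):
        # Each visitor is identified only by the ID stored in their cookie.
        self.cookie_ids.add(cookie_id)

    def __contains__(self, cookie_id):
        return cookie_id in self.cookie_ids

def matching_lists(cookie_id, shared_lists):
    """Names of every shared list that contains this visitor's cookie ID."""
    return [lst.name for lst in shared_lists if cookie_id in lst]

# Hypothetical lists built by an analytics tag on two different sites.
shoe_shop = RemarketingList("viewed-shoes")
travel_site = RemarketingList("searched-flights")
shoe_shop.record_visit("cookie-123")
travel_site.record_visit("cookie-123")
travel_site.record_visit("cookie-456")
```

Once both lists are visible to the same ad system, a single lookup for `cookie-123` returns both names, which is the sense in which sharing lists ties the tracking together.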

Of course, users often have little control over, and get little benefit from, the storage of user data and ad retargeting by trackers, especially when many parties are involved. Some trackers provide more control than others, such as AdChoices, for example.

So why didn’t Larry/Sergey consider the ethical and other issues when buying DoubleClick for example?

One reason, I assume, is that no one cared as much about security when AdSense added Flash ads, for example, with exploits not as common as now. Among the first common exploits (dating back to the late 1990s) were stack-based and sometimes heap-based buffer overruns (using null-terminated C string copies that don’t limit the length copied, for example). Then the exploits got more sophisticated and complex (like use-after-free bugs and information leaks used to disclose addresses and defeat ASLR), especially as mitigation measures like stack canaries, NX and ASLR became common in response. I assume that the market for exploit kits, zero-day exploits and the like also probably took time to develop (though some of them were made famous by recent NSA leaks, for example).

Before the Google-DoubleClick acquisition, DoubleClick acquired Abacus Direct in 1999. The FTC investigated the plan to combine the two companies’ data because of the privacy problems (especially problems with deanonymizing users), and DoubleClick backed off from merging the data. Abacus Direct seems to have been a market research company targeting consumer buying behavior. As a result, Abacus had a lot of personal info about consumers, and there were concerns that this data could be merged with DoubleClick data.

The Google-DoubleClick acquisition was also controversial, with EPIC, for example, filing complaints with the FTC. There was also a Senate hearing on Sept 27, 2007 with testimony from a variety of sources regarding that issue. One of the concerns back then was aggregation of tracking data and lack of control by users.

In 2012, Jonathan Mayer discovered that Google used some tricks in JavaScript to allow tracking in Safari. Google was able to bypass Safari’s cookie blocking policy by using an invisible form to fool Safari into allowing cookies. The FTC fined Google $22.5 million over this behaviour, and more recently there have been lawsuits about it in the UK. Google argued at the time that the tracking was unintentional and that it was related to Google+ “Plus” buttons on DoubleClick ads (for logged-in users, I believe). It is probably worth mentioning here that a lot of these kinds of buttons (like Facebook’s Like buttons, to name another example) do their own tracking too (they generally work by using IFRAMEs pointing to the website involved), and this has been well known for years. For example, according to https://www.technologyreview.com/s/541351/facebooks-like-buttons-will-soon-track-your-web-browsing-to-target-ads/ Facebook started using the tracking Like buttons to target ads in 2015. I think the Facebook WhatsApp acquisition story is also famous by now, including how they eventually allowed data sharing between the two. It is worth mentioning that even the WhatsApp founders now recommend deleting Facebook (especially after the Cambridge Analytica debacle).

Now, let’s talk about Mozilla. Brendan Eich was the creator of JavaScript and was the CTO of Mozilla Corporation from 2005 to 2014. After he stepped down from Mozilla in 2014, he started Brave with its Basic Attention Token etc. Andreas Gal joined Mozilla in 2008 and was the CTO from 2014 until 2015, when he left Mozilla.

Mozilla signed the Google search deal in 2004, before Google even IPOed (let alone things like DoubleClick). Mozilla switched to a Yahoo search deal in late 2014. Recently Mozilla switched back to Google as the default.

Brendan Eich mentioned in https://twitter.com/BrendanEich/status/932747825833680897 that “It's not a simple Newtonian-physics (or fake economics based on same) problem.” This was about the history of the Google search deal with Mozilla and the fact that it was signed before Google IPOed (when it was being funded by VCs). It is worth mentioning here that Google was founded in 1998, when the now famous dot-com bubble was approaching its peak and VC funding was common (allowing many startups to grow fast, which was considered more important than profits). Many other dot-com startups at the time had problems and ended up failing when the bubble collapsed around 2001. It is also worth mentioning that the DoubleClick acquisition dates back to 2007, just before the housing bubble famously collapsed, leading to another recession; that bubble probably started just after the dot-com bubble.

It was mentioned on Twitter that Firefox OS enabled tracking protection by default unlike desktop Firefox. It was mentioned in https://twitter.com/andreasgal/status/932757853504339968 that “Yup. I was able to sneak that past management”. I then asked “I wonder if you ever talked to Larry/Sergey.” and Brendan then answered that Andreas didn’t of course. I wonder what would have happened if they did.

https://pagefair.com/blog/2017/gdpr_risk_to_the_duopoly/ has some information on the effect of the EU GDPR on Google ads. Notice that AdWords complies if all “personalization” features are removed, for example. This includes things like “remarketing”. I suspect that AdWords did not have these features when it was first created in 2000. Other features like “remarketing lists for search ads” are also listed as not compliant, and were of course probably added later too.

One of the first types of blocking was popup blockers, and Google was taking a stand against popups in the early days (they were well known to be annoying). Popup blockers became common in browsers by the mid-2000s (even IE6 in XP SP2 had one). At one point circa 2002, AOL/Netscape disabled the popup blocker in Netscape-branded Mozilla releases (for example the original Netscape 7 release, I think). Of course, this was long before Google bought DoubleClick. Later, more sophisticated ad and cookie blockers like AdBlock Plus and uBlock Origin came out as add-ons to browsers like Firefox, and one is built into Brave (along with BAT as a replacement for the lost ad revenue). Many other browsers, including Firefox and IE, also have similar tracking protection, but they leave it disabled by default.

Of course, it is worth noting that Google/DoubleClick isn’t the only one involved in the ad bubble (though DoubleClick was one of the first to do ad tracking, I think). I think Taboola is often considered even worse than Google, for example. However, the same fundamental problems with tracking, malware ads, the ad bubble etc. tend to apply to all of the ad networks.

Recently, Google’s ad blocking and “better ads” efforts (including the so-called Coalition for Better Ads) target annoying ads, but don’t fix the fundamental issues described here. Apple’s ad blocking targets retargeting by limiting the lifetime of cookies, for example (making them less effective for tracking), but does not change the display of ads or make ads less annoying (autoplay video ads are pretty famous as well, especially with Flash).

Now, fixing the problems might be difficult. One example here is that both Microsoft and Novell used CALs. CALs (called node licenses by Novell, I think) are per-user or per-computer licenses common in server software like NetWare and Windows Server. Of course, when Novell moved to Linux (by buying SUSE), it was open source software that didn’t have CALs (the company only pays for support), meaning that Novell could not expect the same level of revenue as in the NetWare days. The story of Sun’s open source projects (some examples included OpenSolaris, OpenOffice, and OpenJDK) and Jonathan Schwartz (the former “ponytail” CEO), and how Sun eventually had to sell to Oracle, is probably pretty famous as well. The ad bubble will probably not last forever though. This is part of the problem of the current debt-based economy (which allows almost infinite amounts of money to be printed), especially how it encourages extracting as much money as possible from so-called “consumers” (another example is Adobe Creative Cloud subscriptions and how Adobe’s stock price rose).

Google was famous for offering large amounts of storage in Gmail since the launch in 2004, not to mention that the size of the search index also probably grows over time. This obviously means the amount of revenue Google makes always has to grow (since total storage costs keep increasing), or eventually profit margins would decline. This is particularly hard during recessions like the one in 2007-2008. According to https://www.economist.com/node/14140373, Internet advertising declined the least in Q1 2009 but still declined. This is still an issue with cloud providers offering “unlimited” storage to users, which gets abused to store excessive data (the most recent example is Amazon, where some users were touting being able to store more than 1PB, leading Amazon to end unlimited storage).

Saturday, March 24, 2018

Google DoubleClick Mozilla essay second draft

This essay will describe the history of Internet advertising at Google. I will also be talking about the ethical issues of some of the kinds of ads that Google produces, and why Larry/Sergey didn’t consider them, for example. Of course, it is worth noting that Google isn’t the only one involved in the ad bubble. This essay will also talk about Mozilla, including Brendan Eich, who created JavaScript.

Google was founded in 1998 by Larry Page and Sergey Brin while at Stanford, and took VC funding. Eric Schmidt was brought in as CEO in 2001; he recently stepped down as executive chairman but is still on the board. Google IPOed in 2004.

The first kind of ads that Google did was AdWords, dating back to 2000. AdWords was based on search keywords, and the text ads were displayed at the top of the search results (labelled as ads) and were relatively simple. Typically the highest bidder was shown, and the advertiser paid Google when the user clicked on the ad. AdWords involved relatively little tracking, at least initially, and will not be mentioned much here. At this time Google was also taking a stand against popup ads.

AdSense showed ads on webpages themselves, using JavaScript. It was introduced in 2003. AdSense, at least initially, was based on keywords on the webpages themselves (which Google fetched from its cache, for example), which advertisers could bid on. As with AdWords, Google and the websites get paid when users click on the ads. It also involved little tracking, at least initially, but the malware problems will be described here.

Google bought DoubleClick in 2008. DoubleClick was founded in 1995. It made more sophisticated ad tracking via cookies and the like famous (often called “retargeting”), and the problems will be described here. DoubleClick itself called its product “Dynamic Advertising Reporting and Targeting” at one point, for example. Initially DoubleClick was mostly banner ads, and many users developed so-called banner blindness from these ads.

Google bought Urchin in 2005, turning it into Google Analytics. Initially Urchin’s product analyzed web server log files, with JavaScript tags being added later.

One of the problems with ads is malware. Typically websites take the highest bidder for ad space and fill as much space as possible with ads, making malware like exploit kits difficult to prevent. To make things worse, companies can only spend a limited amount of money on ads, so sites often have to take the highest bidder, and sometimes websites even use multiple ad networks. Flash was famous for many exploits, for example, and these days plug-ins in general are dying off (Java was even worse). Of course, there are browser exploits too, like in Firefox and Chrome.

Though the vast majority of exploits in kits are typically already patched, sometimes unpatched zero day exploits get delivered by ads like in the case of https://www.trendmicro.com/vinfo/us/security/news/zero-day-exploit. There is a market for exploit kits in general, and zero days are particularly valuable.

One of the most famous cases of ads containing malware was at Forbes, where the Angler exploit kit was served via pop-under ads after the site asked users to turn off ad blockers. Of course, asking users to turn off ad blockers or otherwise fighting against them is not a good idea in the first place.

Douglas Crockford tried to prevent malicious JavaScript in ads at Yahoo with AdSafe, including cross site scripting attacks. Of course, JavaScript is a Turing complete language making this more difficult, and Flash is even more complex. This is especially an issue when browser exploits are involved.

Another problem is tracking. The current economy is a debt-based economy based on consumption. The more money advertisers can extract from consumers, the more they are willing to spend on ads. This results in tracking getting creepier and creepier. Most of the tracking is called “retargeting” and it is often based on cookies and JavaScript.

For example, DoubleClick introduced cross-device retargeting in 2015. Of course, at least initially it was limited to tracking logged-in users via the user account, which any website can do, but it illustrates the trend. Google changed its privacy policy in 2016 to allow Google accounts to be used for such logged-in user tracking.

Google Analytics added AdWords and AdSense support in 2009. In 2012, Google changed its privacy policy to allow data to be consolidated, which was also very controversial. In 2014, Google Analytics integrated with DoubleClick, allowing things like remarketing lists to be shared. Remarketing lists for search ads (tied to Google Analytics) were introduced in 2015. Remarketing lists are basically lists of website visitors that can be uniquely identified by things like cookies, and they are one of the ways of targeting ads to users. Sharing remarketing lists basically ties the tracking together.

Of course, users often have little control over, and get little benefit from, the storage of user data and ad retargeting by trackers, especially when many parties are involved. Some trackers provide more control than others.

So why didn’t Larry/Sergey consider the issues when buying DoubleClick for example?

One reason I assume is that no one cared as much about security when AdSense added Flash ads for example, with exploits not as common as now. I assume that the market for exploit kits and zero day exploits and the like took time to develop.

Before the Google-DoubleClick acquisition, DoubleClick acquired Abacus Direct in 1999. The FTC investigated the plan to combine the two companies’ data because of the privacy problems (especially problems with deanonymizing users), and DoubleClick backed off from merging the data.

The Google-DoubleClick acquisition was controversial, with EPIC, for example, filing complaints with the FTC. There was also a Senate hearing on Sept 27, 2007 with testimony from a variety of sources regarding that issue. One of the concerns was aggregation of tracking data and lack of control by users.

Now, let’s talk about Mozilla. Brendan Eich was the creator of JavaScript and was the CTO of Mozilla Corporation from 2005 to 2014. After he stepped down from Mozilla in 2014, he started Brave with its Basic Attention Token etc. Andreas Gal joined Mozilla in 2008 and was the CTO from 2014 until 2015, when he left Mozilla.

Mozilla signed the Google search deal in 2004, before Google even IPOed (let alone things like DoubleClick). Mozilla switched to a Yahoo search deal in late 2014. Recently Mozilla switched back to Google as the default.

Brendan Eich mentioned in https://twitter.com/BrendanEich/status/932747825833680897, on the Google search deal and the history of Google, that “It's not a simple Newtonian-physics (or fake economics based on same) problem.”

It was mentioned that Firefox OS enabled tracking protection by default unlike desktop Firefox. It was mentioned in https://twitter.com/andreasgal/status/932757853504339968 that “Yup. I was able to sneak that past management”.

Google’s ad blocking and “better ads” efforts target annoying ads, but don’t fix the issues described here. Apple’s ad blocking targets retargeting, but does not change the display of ads or make ads less annoying.

Tuesday, March 13, 2018

Google DoubleClick essay first draft

Note: This is the first draft. Many issues like Mozilla and Google Analytics are not covered in detail yet. Final essay will be posted in April. Thanks Brendan Eich for the inspiration for the essay.

This essay will describe the history of Internet advertising at Google. I will also be talking about the ethical issues of some of the kinds of ads that Google produces, and why Larry/Sergey didn’t consider them, for example. Of course, it is worth noting that Google isn’t the only one involved in the ad bubble.

The first kind of ads that Google did was AdWords, dating back to 2000. AdWords was based on search keywords, and the text ads were displayed at the top of the search results (labelled as ads) and were relatively simple. Typically the highest bidder was shown, and the advertiser paid Google when the user clicked on the ad. AdWords involved relatively little tracking, at least initially, and will not be mentioned much here. At this time Google was taking a stand against popup ads.

AdSense showed ads on webpages themselves, using JavaScript. It was introduced in 2003. AdSense, at least initially, was based on keywords on the webpages themselves (which Google fetched from its cache, for example), which advertisers could bid on. As with AdWords, Google and the websites get paid when users click on the ads. It also involved little tracking, at least initially, but the malware problems will be described here.

Google bought DoubleClick in 2008. DoubleClick was founded in 1995. It made more sophisticated ad tracking via cookies and the like famous (often called “retargeting”), and the problems will be described here. DoubleClick itself called its product “Dynamic Advertising Reporting and Targeting” at one point, for example. Initially DoubleClick was mostly banner ads, and many users developed so-called banner blindness from these ads.

One of the problems with ads is malware. Typically websites take the highest bidder for ad space and fill as much space as possible with ads, making malware like exploit kits difficult to prevent. To make things worse, companies can only spend a limited amount of money on ads, so sites often have to take the highest bidder, and sometimes websites even use multiple ad networks. Flash was famous for many exploits, for example, and these days plug-ins in general are dying off (Java was even worse). Of course, there are browser exploits too, like in Firefox and Chrome.

Though the vast majority of exploits in kits are typically already patched, sometimes unpatched zero day exploits get delivered by ads like in the case of https://www.trendmicro.com/vinfo/us/security/news/zero-day-exploit. There is a market for exploit kits in general, and zero days are particularly valuable.

One of the most famous cases of ads containing malware was at Forbes, where the Angler exploit kit was served via pop-under ads after the site asked users to turn off ad blockers. Of course, asking users to turn off ad blockers or otherwise fighting against them is not a good idea in the first place.

Douglas Crockford tried to prevent malicious JavaScript in ads at Yahoo with AdSafe, including cross site scripting attacks. Of course, JavaScript is a Turing complete language making this more difficult, and Flash is even more complex. This is especially an issue when browser exploits are involved.

Another problem is tracking. The current economy is a debt-based economy based on consumption. The more money advertisers can extract from consumers, the more they are willing to spend on ads. This results in tracking getting creepier and creepier. Most of the tracking is called “retargeting” and it is often based on cookies and JavaScript.

For example, DoubleClick introduced cross-device retargeting in 2015. Of course, at least initially it was limited to tracking logged-in users via the user account, which any website can do, but it illustrates the trend. Google changed its privacy policy in 2016 to allow Google accounts to be used for such logged-in user tracking.

Of course, users often have little control over, and get little benefit from, the storage of user data and ad retargeting by trackers, especially when many parties are involved. Some trackers provide more control than others.

So why didn’t Larry/Sergey consider the issues when buying DoubleClick for example?

One reason I assume is that no one cared as much about security when AdSense added Flash ads for example, with exploits not as common as now.

Google’s ad blocking targets annoying ads, but doesn’t fix the issues described here. Apple’s ad blocking targets retargeting, but does not make ads less annoying.

Thursday, December 14, 2017

Google, Mozilla, and the debt-based economy

Thread 1:
From https://twitter.com/BrendanEich/status/932020295384178688:
"From me to you (I'm still bound by non-disclosure agreements with Mozilla), when we started the first Google/Firefox search deal in 2004, it was a better world. Pre-Doubleclick/YouTube Google, for one. Pre-programmatic ad tech too. Important not to equate threat from then to now."
From https://twitter.com/yuhong2/status/932456215048724481:
"There is a reason why I asked about your meeting with Google founders in 2005 around the time they acquired Urchin."
From https://twitter.com/yuhong2/status/932463189547278337:
"I mean, did you care about Google Analytics back then?"
From https://twitter.com/BrendanEich/status/932463682822422528:
"No, it was not tied into search ads or anything like what doubleclick brought to the table. We've been over this. That was a different era. Maybe you saw farther than I did -- if so, good for you!"
From https://twitter.com/yuhong2/status/932463850237997056:
"And do you think it is the founder's fault?"
From https://twitter.com/yuhong2/status/932470729144020992:
"AFAIK they were against popup ads in the early days."
From https://twitter.com/BrendanEich/status/932473969625595904:
"A friend said in 2003 that Sergey declared G would not acquire display ads & arb. Search vs. Display as that would be “evil”. But going public in 2004 inevitably meant growth, arb-opptys, monopoly power. Capitalism 101, I said recently.

Could search-only G have become a utility?"
From https://twitter.com/yuhong2/status/932474138450542592:
"Yea, part o the problem is that it took VC funding so it had to IPO or sell to exit."
From https://twitter.com/yuhong2/status/932475230748008448:
"I said before that VC might not quite be debt, but it is close enough for this discussion."

Thread 2:
From https://twitter.com/jwajsberg/status/932746958703349761 :
"Totally agree. I don't get why we don't do that now, in this time where we want to take risks again. Note we enabled tp by default in Firefox os..."
From https://twitter.com/andreasgal/status/932757853504339968 :
"Yup. I was able to sneak that past management"
From https://twitter.com/yuhong2/status/932760376294359040 :
"I wonder if you ever talked to Larry/Sergey."
From https://twitter.com/BrendanEich/status/932761563617837057:
"He didn’t - will you give it a rest?! It wasn’t that explicit back in day, and more recently it was Sundar’s people not Sergey or Larry!"
From https://twitter.com/yuhong2/status/932761950848557057:
"I wonder what the discussion would be like if they did."

Thread 3:
From https://twitter.com/yuhong2/status/932747119009546240 :
"Thinking about it, the Firefox/Google search deal was probably before or during the IPO and they were VC funded before that, right?"
From https://twitter.com/BrendanEich/status/932747375986163712 :
"It was pre-IPO. They were definitely thinking about that but they were also naive. Both founders said things like "we can defy public markets and take losses doing what is right". Lol."
From https://twitter.com/yuhong2/status/932747653963702273 :
"I am mainly talking about where the funding came from though."
From https://twitter.com/BrendanEich/status/932747825833680897 :
"It's not a simple Newtonian-physics (or fake economics based on same) problem."
From https://twitter.com/yuhong2/status/932747980272103424 :
"Yea, part of the problem is how the current debt based economy works in the first place."

A few more:
From https://twitter.com/yuhong2/status/933611862268174336:
"Thinking about it, if Google showed no growth, it is not the end of the world but things like stock options would worth less, right?"
From https://twitter.com/yuhong2/status/934647951518920705:
"It probably doesn't help that things like storage costs scale with the size of the index not the number of searches per day or the like."
From https://twitter.com/yuhong2/status/934655735366959104:
"Imagine the search revenue don't grow but the size of the search index still grows."
From https://twitter.com/yuhong2/status/934656082214993923:
"Though per-GB storage cost is cheaper today than it was last decade."

Monday, December 26, 2016

NT 4.0, .NET 1.1, and INTLFXSR.SYS problems

Here is the code from a disassembly of INTLFXSR.SYS with symbols:
.text:000102A0 ; __stdcall FxsrGetProcessorFeatures()
.text:000102A0                 public _FxsrGetProcessorFeatures@0
.text:000102A0 _FxsrGetProcessorFeatures@0 proc near   ; CODE XREF: DriverEntry(x,x)+61 p
.text:000102A0                 push    edi
.text:000102A1                 push    esi
.text:000102A2                 push    ebx
.text:000102A3                 pushf
.text:000102A4                 pop     eax
.text:000102A5                 push    eax
.text:000102A6                 mov     ecx, eax
.text:000102A8                 xor     eax, 40000h
.text:000102AD                 push    eax
.text:000102AE                 popf
.text:000102AF                 pushf
.text:000102B0                 pop     eax
.text:000102B1                 cmp     ecx, eax
.text:000102B3                 jz      short cpu_is_i386
.text:000102B5                 mov     eax, ecx
.text:000102B7                 xor     eax, 200000h
.text:000102BC                 push    eax
.text:000102BD                 popf
.text:000102BE                 pushf
.text:000102BF                 pop     eax
.text:000102C0                 cmp     ecx, eax
.text:000102C2                 jz      short other_cpu
.text:000102C4                 mov     eax, 0
.text:000102C9                 cpuid
.text:000102CB                 cmp     eax, 3
.text:000102CE                 jg      short cpu_identified
.text:000102D0                 mov     _VerifyIntel, ebx
.text:000102D6                 mov     dword_106C4, edx
.text:000102DC                 mov     dword_106C8, ecx
.text:000102E2                 lea     esi, _VerifyIntel
.text:000102E8                 lea     edi, _GenuineIntel ; "GenuineIntel"
.text:000102EE                 mov     ecx, 0Ch
.text:000102F3                 repe cmpsb
.text:000102F5                 jnz     short other_cpu
.text:000102F7                 mov     eax, 1
.text:000102FC                 cpuid
.text:000102FE                 mov     eax, edx
.text:00010300                 jmp     short cpu_identified
.text:00010302 ; ---------------------------------------------------------------------------
.text:00010302
.text:00010302 other_cpu:                              ; CODE XREF: FxsrGetProcessorFeatures()+22 j
.text:00010302                                         ; FxsrGetProcessorFeatures()+55 j
.text:00010302                 mov     eax, 0
.text:00010307                 jmp     short cpu_identified
.text:00010309 ; ---------------------------------------------------------------------------
.text:00010309
.text:00010309 cpu_is_i386:                            ; CODE XREF: FxsrGetProcessorFeatures()+13 j
.text:00010309                 mov     eax, 0
.text:0001030E
.text:0001030E cpu_identified:                         ; CODE XREF: FxsrGetProcessorFeatures()+2E j
.text:0001030E                                         ; FxsrGetProcessorFeatures()+60 j ...
.text:0001030E                 popf
.text:0001030F                 pop     ebx
.text:00010310                 pop     esi
.text:00010311                 pop     edi
.text:00010312                 retn
.text:00010312 _FxsrGetProcessorFeatures@0 endp
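If you know x86 assembly, you will notice that the routine only returns the CPUID leaf 1 feature flags when the CPU identifies itself as GenuineIntel and CPUID leaf 0 returns a value no greater than 3 (the jg after cmp eax, 3 skips the leaf 1 query otherwise).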
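To make the control flow easier to follow, here is a minimal Python sketch that models the routine's decisions. It is not the driver's code: the function name and parameters are hypothetical, the CPUID results are passed in as plain values rather than read from hardware, and the idea that the caller then tests the FXSR bit (CPUID leaf 1, EDX bit 24) is my assumption, since DriverEntry is not shown here.

```python
FXSR_BIT = 1 << 24  # CPUID leaf 1, EDX bit 24: FXSAVE/FXRSTOR supported
SSE_BIT = 1 << 25   # CPUID leaf 1, EDX bit 25: SSE supported


def model_fxsr_get_processor_features(can_toggle_ac, can_toggle_id,
                                      max_leaf, vendor, leaf1_edx):
    """Return the value the routine would leave in EAX (hypothetical model).

    max_leaf is assumed to be a small non-negative number, so the signed
    comparison done by jg behaves like an ordinary integer comparison.
    """
    if not can_toggle_ac:
        return 0          # cpu_is_i386: EFLAGS.AC (40000h) cannot be changed
    if not can_toggle_id:
        return 0          # other_cpu: EFLAGS.ID (200000h) is fixed, no CPUID
    if max_leaf > 3:
        return max_leaf   # jg cpu_identified: EAX still holds the leaf 0 result
    if vendor != "GenuineIntel":
        return 0          # other_cpu: vendor string mismatch
    return leaf1_edx      # mov eax, edx after CPUID leaf 1


# Illustrative inputs, not values read from real hardware.
flags = FXSR_BIT | SSE_BIT
old_intel = model_fxsr_get_processor_features(True, True, 2, "GenuineIntel", flags)
newer_intel = model_fxsr_get_processor_features(True, True, 10, "GenuineIntel", flags)
non_intel = model_fxsr_get_processor_features(True, True, 1, "AuthenticAMD", flags)

print(bool(old_intel & FXSR_BIT))    # True
print(bool(newer_intel & FXSR_BIT))  # False
print(bool(non_intel & FXSR_BIT))    # False
```

In this model, a CPU whose leaf 0 value exceeds 3 returns that small number instead of the feature flags, so bit 24 is never seen as set even though the hardware supports FXSR.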
As for the .NET Framework 1.1 problems, the way to determine whether SSE can be used is to first use CPUID to check that the SSE bit is set. But there is also an extra step: without CR4.OSFXSR set, SSE instructions will cause #UD, which can be caught on Windows as an SEH exception. My guess is that .NET 1.1 does not perform that second check, which is why it crashes when INTLFXSR.SYS has not loaded properly.
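The two-step check described above can be modeled as follows. This is a sketch only: user-mode code cannot read CR4, so a real implementation would execute an SSE instruction inside an exception handler. Here the hypothetical `probe` callable stands in for that instruction, and the hypothetical `IllegalInstruction` exception stands in for #UD surfacing as an SEH exception.

```python
SSE_BIT = 1 << 25  # CPUID leaf 1, EDX bit 25: SSE supported


class IllegalInstruction(Exception):
    """Hypothetical stand-in for #UD delivered as an SEH exception."""


def sse_usable(leaf1_edx, probe):
    """Return True only if CPUID reports SSE and an SSE instruction runs."""
    if not leaf1_edx & SSE_BIT:
        return False      # step 1: the CPU does not report SSE at all
    try:
        probe()           # step 2: try one SSE instruction
    except IllegalInstruction:
        return False      # models the fault seen when CR4.OSFXSR is clear
    return True


def probe_os_enabled():
    pass                  # models an SSE instruction that executes normally


def probe_os_not_enabled():
    raise IllegalInstruction()  # models #UD when CR4.OSFXSR is not set


print(sse_usable(SSE_BIT, probe_os_enabled))      # True
print(sse_usable(SSE_BIT, probe_os_not_enabled))  # False
print(sse_usable(0, probe_os_enabled))            # False
```

Code that stops after step 1 would treat the second case as usable and fault on its first SSE instruction.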

Tuesday, June 16, 2015

Why your Core 2 processor appears to not have CMPXCHG16B

From http://download.intel.com/design/processor/specupdt/318733.pdf :
"AW67. Enabling PECI via the PECI_CTL MSR Does Not Enable PECI and May Corrupt the CPUID Feature Flags
Problem: Writing PECI_CTL MSR (Platform Environment Control Interface Control Register) will not update the PECI_CTL MSR (5A0H), instead it will write to the VMM Feature Flag Mask MSR (CPUID_FEATURE_MASK1, 478H).
Implication: Due to this erratum, PECI (Platform Environment Control Interface) will not be enabled as expected by the software. In addition, due to this erratum, processor features reported in ECX following execution of leaf 1 of CPUID (EAX=1) may be masked. Software utilizing CPUID leaf 1 to verify processor capabilities may not work as intended.
Workaround: It is possible for the BIOS to contain a workaround for this erratum. Do not initialize PECI before processor update is loaded. Also, load processor update as soon as possible after RESET as documented in the RS – Wolfdale Processor Family Bios Writers Guide, Section 14.8.3 Bootstrap Processor Initialization Requirements. "
The CMPXCHG16B feature flag is one of the flags reported in ECX, so this erratum can make it appear to be missing.
This erratum affects only the E0/R0 steppings of 45nm Core 2 processors, as you can see in the Summary Table of Changes.
Generally, a BIOS update will contain the needed microcode update (the "processor update" mentioned in the workaround above).
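As a rough illustration of the effect, the sketch below tests the CMPXCHG16B bit (bit 13 of ECX from CPUID leaf 1) before and after a mask is applied. The ECX and mask values are made up for illustration, the function names are hypothetical, and treating the mask as a simple bitwise AND is my assumption rather than something the erratum text spells out.

```python
CX16_BIT = 1 << 13  # CPUID leaf 1, ECX bit 13: CMPXCHG16B
SSE3_BIT = 1 << 0   # CPUID leaf 1, ECX bit 0: SSE3


def has_cmpxchg16b(leaf1_ecx):
    """Check the CMPXCHG16B feature flag in a leaf 1 ECX value."""
    return bool(leaf1_ecx & CX16_BIT)


def apply_feature_mask(leaf1_ecx, mask):
    """Assumed model of masking: bits cleared in the mask read back as 0."""
    return leaf1_ecx & mask & 0xFFFFFFFF


# Illustrative values only, not taken from a real processor.
true_ecx = SSE3_BIT | CX16_BIT
corrupted_mask = 0xFFFFFFFF & ~CX16_BIT
reported_ecx = apply_feature_mask(true_ecx, corrupted_mask)

print(has_cmpxchg16b(true_ecx))      # True
print(has_cmpxchg16b(reported_ecx))  # False
```

Software that relies on the reported flag would conclude that the instruction is unavailable, even though the processor implements it.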
For those who have Intel motherboards, from https://communities.vmware.com/message/1765787 :
"I got fed up and went to Intel on this one.  One of their second level people finally gave me the suggestion that I should again flash the BIOS update, but use the method for full bios refresh, rather than the windows-based update process.  I suspect that the microcode fix referred to in AV69 is in a part of the bios core that is not updated unless you do the full refresh."