Updated 11/23/11
Nicholas Thompson has a new piece at the Culture Desk on The New Yorker’s web site called Tweeting Your Way to the Oval Office, looking at the candidates’ Tweeting and Facebooking and their resulting Klout and Peek Analytics scores. He writes that “The old way to figure out how much you matter on the Internet was to count your Twitter followers and Facebook friends; the new way is to try to measure how ‘influential’ those people are.” Klout and Peek analyze and present those “influence” scores. (My own Klout score is 21. Wow :) )
But I ended up writing the piece below on looking under the hood of the web sites because back in August, the (past) staff web site writer Samantha Henig took a look at Republican presidential candidate Rick Perry’s website after he jumped into the race.
She examined his site design and noted his (or his PR company’s) word choice in the site header of “Perry President” (without a “for”) that Henig said “only adds to the sense that this guy already has the job, and next November is just a technicality.” Henig sleuthed around and pointed out other aspects of Perry’s site that are curious, but reading her piece got me thinking: what’s under the hood of the site?
I wondered that because checking the source code of websites is something I do often when I come across a very nice (or in some cases very bad) website: looking at the code under the hood gives some clues about how the website looks and works the way it does.
What I mean by looking under the hood is examining the programming code that is used by the web browser. This is the code that is delivered by the web server to your browser – be it Firefox, Internet Explorer, Safari or one of the mobile browsers – and that the browser uses to display the page, text, images and movies.
You can see this source code by choosing “View Source” from the menu of your browser or right clicking in a blank spot on the web page and selecting “View Source” in the pop up menu. Now, unless you’re a web programmer or know some basics of HTML – which stands for Hypertext Markup Language, one of the basic code languages of the world wide web – what you see will appear to be all Greek.
So I took a look at the source code of Perry’s site and then compared what I found with the other Republican candidates’ sites. Buried in the code are details of how the websites are built and optimized for search engines and social networking. I found that the candidates’ web teams have done some things in common, and I even spotted some code mistakes.
And so (mostly) aside from politics, this is a geek’s look at the candidates’ sites. I looked at the current three top dogs – Rick Perry, Mitt Romney and Michele Bachmann – as well as those trailing – Newt Gingrich, Herman Cain, Jon Huntsman, Ron Paul – and the one presumed jumper, Sarah Palin. And I looked at President Obama’s site, too.
I looked at three different aspects of these websites. The first is the source code I’ve mentioned and what it says about the site, its use of social networking and other details. The second is what I can tell about the actual computer servers that handle the web traffic. The third is the registrations of the domains themselves, i.e. who actually “owns” thiscandidate.com.
First, the source code. One thing I found interesting right away is that all of the candidates use what’s called a content management system (CMS). That’s a fancy way of saying they use a software package to make and update their websites, and the “content management” part of that software makes it easy for non-techy types to update a site. In the bad old days, one needed to directly edit the code files to make changes to a website. These days with a CMS, it’s point, click, type and punch a “Publish” button, much like using Microsoft Word.
Content management systems also make sense because the 24/7 news cycle requires a fast response for anything concerning the candidate. And that results in the need for many people to be working on the website at the same time: one person to blog, one person to update and edit position statements, one to moderate reader comments. That’s another thing content management systems do well: they can allow editors and writers to work at the same time, and allow deeper site access to some workers and minimum amounts of access to others.
The most popular CMS among the candidates is called WordPress, and is used by Bachmann, Perry, Paul and Cain. WordPress is the most popular CMS in use on the Internet. People visit the approximately 50 million WordPress-based sites around the world hundreds of millions of times a day. (For the Geeks out there, I need to mention that WordPress comes in two basic flavors: one is available for free at WordPress.com and is used by ~30 million bloggers, and the other is using a copy of that same, basic software from WordPress.org – also used by another ~25 million sites – and running it on your own web server for greater power and flexibility.)
WordPress is what’s known as open source software. In the open source software world, many different programmers around the world contribute to a software project’s features and bug fixes, add their own plugins and make new versions. They do this work while leaving their code “open” so others can see it, modify it and use it in their own projects. And WordPress is free, too. Anyone can use WordPress.com or download from .org. (Some particular add-ons are made by private companies and are not free.)
Open source software grew out of the nascent computer culture long before the 1960’s (and the Internet), when early software developers recognized the power of what they were working on and wanted everyone to benefit. The bottom line is that open source is probably one of the reasons WordPress is so popular. Open source software in general is promoted by the Free Software Foundation.
While the four candidates I mentioned use WordPress, Romney, Huntsman and Gingrich use a CMS called Drupal, which is in a way more geeky and complex to use, but can be more powerful in terms of building complex sites. Drupal is also open source and free to download and use, but it hasn’t gained the traction WordPress has.
One interesting thing I recognized in the source code of Perry’s WordPress site is that he (or rather his web development group) is trying to hide the fact that the site runs WordPress. I can see only some tell-tale hints of WordPress (like a wp-includes directory, a body class that is typical of WordPress and other xhtml), and that tells me that some other indications are purposefully hidden from view. But enough is revealed to say he is using WordPress and is trying to hide the source code “signature” that WordPress shows to those looking under the hood.
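For the geeks, here is a rough sketch in Python of the kind of fingerprint hunting described above. The three patterns and both sample pages are my own illustrative assumptions, not code taken from any candidate’s site, and a real WordPress site may show different hints or none at all.

```python
import re

# Hypothetical fingerprint patterns chosen for illustration only.
WORDPRESS_HINTS = {
    "wp-includes path": re.compile(r"/wp-includes/"),
    "wp-content path": re.compile(r"/wp-content/"),
    "generator meta tag": re.compile(
        r'<meta[^>]+name=["\']generator["\'][^>]+content=["\']WordPress', re.I
    ),
}


def find_wordpress_hints(html):
    """Return the names of the fingerprint patterns found in the page source."""
    return [name for name, pattern in WORDPRESS_HINTS.items() if pattern.search(html)]


# Made-up page source: one site that hides most hints, one that hides nothing.
SAMPLE_HIDDEN = (
    '<html><head><script src="/wp-includes/js/jquery.js"></script></head>'
    '<body class="home"></body></html>'
)
SAMPLE_OPEN = (
    '<html><head><meta name="generator" content="WordPress 3.2.1">'
    '<link rel="stylesheet" href="/wp-content/themes/x/style.css"></head></html>'
)

if __name__ == "__main__":
    print(find_wordpress_hints(SAMPLE_HIDDEN))
    print(find_wordpress_hints(SAMPLE_OPEN))
```

A site that shows only one or two of these hints, while the usual others are missing, is the pattern that suggests someone has been tidying up on purpose.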
I doubt this is for any political ends; I think his web team is simply trying to make the site more secure against hackers, who cruise the web, looking for easy prey. WordPress has had security problems in the past, and there may be undiscovered vulnerabilities in the current version. Hiding the details of the software one uses is a quick way to be a bit more secure. It’s not perfect, but it’s called “security through obscurity.” (I used Perry’s website contact form for comment, but no luck.)
Bachmann also uses WordPress, but her web team hasn’t tried to hide anything; all the evidence of using WordPress is out in the open. And in fact, the site code has some amateurish errors in it. Her RSS news feed was actually broken for a week, which is not good for users. (It’s working again as of 8/30/11).
When it comes to what can be found about the use of social networking in source code – such as Facebook, Twitter and the like – Perry is out front in a curious way. Like many of the candidates, his site uses what are called Open Graph (OG) metatags, developed by Facebook. OG metatags are little bits of info that help Facebook find and index Perry’s site faster, which makes Perry’s site easier to find on Facebook and lets it make use of the Facebook “Like” feature, too.
But Perry also uses a cutting edge metadata protocol called Dublin Core (DC). These tags are still under development and not currently recognized by the popular search engines like Google and Bing. The Dublin Core Metadata Initiative says it is “an open organization engaged in the development of interoperable metadata standards that support a broad range of purposes and business models.”
So the DC people are looking toward the future when metatags are used for more than just search engines. And Perry’s web team seems to be thinking ahead of the curve, too. If DC tags suddenly become useful, his web team has already “been there, done that.”
Bachmann also uses Facebook Open Graph, and it’s interesting to read in the source code the website description Bachmann’s web team wants Facebook to use. It reads: “With spending at all time highs and jobs at historic lows, the stakes in 2012 are far too high to go it alone. That’s why Michele has assembled a committed team of constitutional conservatives to work together and ensure that Barack Obama is a one-term President. Will you join?” But Perry’s OG description is blank, which is probably a mistake.
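Here is a small Python sketch of how one might pull these metatags out of a page’s source instead of reading it by eye. The sample HTML is invented for illustration; I am assuming the common conventions that Open Graph tags use a property attribute starting with “og:” and that Dublin Core tags use a name attribute starting with “DC.”.

```python
from html.parser import HTMLParser


class MetaTagCollector(HTMLParser):
    """Collect Open Graph (og:*) and Dublin Core (DC.*) meta tags from HTML."""

    def __init__(self):
        super().__init__()
        self.tags = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        # Assumed convention: OG tags use "property", Dublin Core uses "name".
        key = attributes.get("property") or attributes.get("name") or ""
        if key.startswith("og:") or key.lower().startswith("dc."):
            self.tags[key] = attributes.get("content") or ""


def collect_meta_tags(html):
    """Return a dict mapping each OG or DC tag name to its content string."""
    parser = MetaTagCollector()
    parser.feed(html)
    return parser.tags


def blank_tags(tags):
    """Return the names of tags whose content is empty, a likely oversight."""
    return [name for name, content in tags.items() if not content]


# Invented sample markup, not copied from any candidate's site.
SAMPLE = """
<head>
<meta property="og:title" content="Example Candidate for President" />
<meta property="og:description" content="" />
<meta name="DC.title" content="Example Candidate" />
<meta name="keywords" content="not collected" />
</head>
"""

if __name__ == "__main__":
    found = collect_meta_tags(SAMPLE)
    print(found)
    print(blank_tags(found))
```

A blank description like the one in the sample is exactly the sort of thing that is easy to miss by eye and easy to catch with a few lines of code.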
Below are each of the candidates’ Facebook pages and Facebook’s own tally of “Likes,” ranked high to low by the 11/23/11 count, with the earlier 8/30/11 count shown first for comparison. (Huntsman and Cain don’t use Facebook “Likes” on their own sites, but have Facebook pages.) (Some counts combine the old “Share” with the newer “Likes.”) And some of these counts are cumulative from previous election cycles:
2) Ron Paul (ronpaul12): 233,000 on 8/30/11; 596,000 on 11/23/11
3) Michele Bachmann (teambachmann): 462,000 on 8/30/11; 459,000 on 11/23/11
4) Herman Cain (THEHermanCain): 169,000 on 8/30/11; 392,000 on 11/23/11
5) Newt Gingrich (newtgingrich): 146,000 on 8/30/11; 184,000 on 11/23/11
6) Rick Perry (GovernorPerry): 141,000 on 8/30/11; 171,000 on 11/23/11
7) Jon Huntsman (jonhuntsmanjr): 14,000 on 8/30/11; 24,000 on 11/23/11
Twitter gets a bit confusing because most candidates have multiple accounts: one for their campaigns, one or more for the candidate’s current or past lives as incumbents, and then other accounts established by supporters that may or may not have some sort of official status. And in the case of Michele Bachmann, there is at least one spoof account dedicated to her statements and positions. Some counts are cumulative from previous election cycles. Twitter follower numbers from official campaign accounts, ranked high to low by the 11/23/11 count with the earlier 8/30/11 count shown first, are:
2) Mitt Romney (MittRomney): 80,000 on 8/30/11; 174,000 on 11/23/11
3) Herman Cain (THEHermanCain): 63,000 on 8/30/11; 170,000 on 11/23/11
4) Rick Perry (TeamRickPerry): 15,000 on 8/30/11; 105,000 on 11/23/11
5) Jon Huntsman (Jon2012HQ): 3,200 on 8/30/11; 52,000 on 11/23/11
6) Ron Paul (RonPaul): 49,000 on 8/30/11; 42,500 on 11/23/11
7) Michele Bachmann (TeamBachmann): 29,000 on 8/30/11; 33,800 on 11/23/11
As for other social networking services: curiously, only Romney, Cain and Bachmann use Google’s new GPlus.com social network.
As for the second aspect concerning the candidates’ sites – finding out details on the computer servers that actually run the websites and deliver the pages to the web browsers of viewers – one needs to use a service like Robtex instead of looking at the source code. That site provides all of the publicly available information about domains, who owns the registrations, the servers associated with a website, and the complexities of internet routing in general. Any service like Robtex can be daunting to use, but they are very useful tools. Just go to Robtex, type in the URL of the website you want to examine, and you will get all the technical info that is publicly available.
The details of servers and domain registrations come from what’s called DNS, which stands for Domain Name System. Robtex allows easy searches of DNS, which can be thought of as a giant phone book of the Internet, one that is updated every second of every day. DNS guides your web browser requests to the actual server that holds the website by translating, on the fly, user-friendly domain names into the cryptic IP numbers of the Internet. As an example, 64.203.107.148 is the IP of michelebachmann.com, and DNS instantly translates the name michelebachmann.com into that IP so that we humans never have to deal with the numbers.
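To make the phone book idea concrete, here is a minimal Python sketch of a name-to-address lookup. By default it asks the real DNS through the standard library’s socket.gethostbyname, but so that it runs without a network connection I have added a made-up stand-in phone book (the names lookup_ip, FAKE_PHONE_BOOK and fake_resolver are my own, for illustration only).

```python
import socket


def lookup_ip(hostname, resolver=socket.gethostbyname):
    """Translate a hostname to an IPv4 address; return None if the lookup fails."""
    try:
        return resolver(hostname)
    except OSError:  # socket.gaierror, raised on failed lookups, is an OSError
        return None


# A stand-in "phone book" so the example runs offline. The single entry is
# the address quoted in the text; it may no longer match the live DNS.
FAKE_PHONE_BOOK = {"michelebachmann.com": "64.203.107.148"}


def fake_resolver(hostname):
    """Look a name up in the stand-in phone book instead of the real DNS."""
    try:
        return FAKE_PHONE_BOOK[hostname]
    except KeyError:
        raise socket.gaierror("name not found in the stand-in phone book")


if __name__ == "__main__":
    print(lookup_ip("michelebachmann.com", fake_resolver))
    print(lookup_ip("nosuch.example", fake_resolver))
```

Calling lookup_ip with only a hostname would send the question to the real DNS instead, and the answer could differ from the address above, since these records change over time.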
To determine where and what type of web servers hold the website and deliver the webpages, one looks for the “name servers” in the DNS records. Those indicate the actual computer servers that handle the websites. These servers can be one computer in a rack deep in a high rise building, or can be many different computers scattered in server farms around the world.
Romney and Paul use “cloud hosting,” which means their web servers are actually many computers, spread out across the Internet in server farms, in order to get the fastest webpage downloads for all viewers no matter their physical location. Being “in the cloud” is the latest and best thing for websites, because the cloud is everywhere: a site is not tied down to one computer and/or one physical location, and as a result is less vulnerable to failure.
Romney and Paul’s sites use cloud servers from Amazon.com. Amazon’s cloud service is a spin-off business of the huge Internet infrastructure the online retailing giant built for itself, which rivals anyone else’s in the world. Amazon started this service a few years back, and using such an enormous infrastructure for websites pretty much guarantees they will stand up to service outages and even severe hacker attacks.
Bachmann’s site (as well as Cain’s) doesn’t appear to be in the cloud but resides on one or more machines at GoDaddy, based in Chicago. That’s Bob Parsons of the “pose with the elephant I shot in Africa” and “GoDaddy Girl” R-rated web ad schools of PR, also known for cheap but not very reliable web hosting. But Bachmann and Cain are using much stouter and faster grades of servers than the usual GoDaddy website.
Perry’s servers are based in Texas at a big hosting reseller called Rackspace, as are Gingrich’s.
Lastly, the actual domain registrations are interesting. Once again, this involves the Domain Name System: the registration records kept alongside it (known as WHOIS records) contain the information on who owns a particular domain, and that info is publicly accessible. Domain information is public information, much like a phone book of the Internet, but sometimes you run up against unlisted numbers.
Romney, Huntsman, Cain, Gingrich and Paul use private registration services through domainsbyproxy.com and GoDaddy, which means they can hide via a proxy service the actual person or organization who owns the domain. But michelebachmann.com is registered by Dave Bachmann in Stillwater, Minnesota; might this be Marcus David Bachmann, her husband? And rickperry.com is registered by Jordan Root of the Texas Governor’s Office.
Another tidbit from DNS: in light of the latest news about Palin and Rove sniping at each other about her possible candidacy, the domain registrations are interesting. The sarahpac.com domain is registered by Campaign Solutions, “A full-service online consulting firm” with Bachmann as a client, among other Republicans, such as John McCain.
And the sarahpac.com site itself is developed by Upstream Communications, which also does Rove’s site, as well as those of many other Republicans.
How does Obama’s site stack up to all this? Nothing fancy in the source code: no sniping, no goofs. As of 11/23/11, his Facebook page (barackobama) has 24,144,855 Facebook “Likes” (previously 22,000,000). Twitter followers number 11,240,000 (previously 9,890,000) for the BarackObama Twitter campaign account and 2,512,600 (previously 2,370,000) for the whitehouse Twitter account.
Obama’s site appears to be a custom content management system, giving (at least to me) no firm clues as to its pedigree. (But whitehouse.gov is Drupal.) Domain registration for barackobama.com belongs to Obama for America, located in Chicago, Illinois. The site appears to use GoDaddy servers, as well as other servers – one creatively named “bostatic.com” – to handle the load of static resources like images that don’t change much and can be served from a different set of computers for better website speeds.
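One way to spot a separate set of static servers is to list the hostnames that a page’s images, scripts and stylesheets are loaded from. The Python sketch below does that for a chunk of HTML. The sample page and its example.com hostnames are invented for illustration, and I am only looking at img, script and link tags, which is an assumption about where such resources usually show up.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class ResourceHostCollector(HTMLParser):
    """Collect the hostnames that images, scripts and stylesheets load from."""

    def __init__(self):
        super().__init__()
        self.hosts = set()

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag in ("img", "script"):
            url = attributes.get("src")
        elif tag == "link":
            url = attributes.get("href")
        else:
            return
        # Relative URLs have no hostname and are skipped.
        host = urlparse(url or "").hostname
        if host:
            self.hosts.add(host)


def resource_hosts(html):
    """Return a sorted list of hostnames referenced by the page's resources."""
    collector = ResourceHostCollector()
    collector.feed(html)
    return sorted(collector.hosts)


# Invented sample page; the hostnames are placeholders, not real campaign servers.
SAMPLE_PAGE = """
<html><head>
<link rel="stylesheet" href="http://static.example.com/css/site.css">
<script src="/js/local.js"></script>
</head><body>
<img src="http://static.example.com/img/logo.png">
<img src="http://www.example.com/img/photo.jpg">
</body></html>
"""

if __name__ == "__main__":
    print(resource_hosts(SAMPLE_PAGE))
```

If most of the images and stylesheets come from a hostname different from the main site’s, that is a good hint that a separate set of computers is handling the static load.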
Very lastly: Perry has his own Flickr photostream at Flickr: Governor Rick Perry’s Photostream. Another case of being out in front and “presidential?” No other candidate has their own Flickr stream. But Obama does: Flickr: Barack Obama’s Photostream.
That’s that – a geek’s look at the candidates’ sites. The politics of the race – and maybe even the code – will only get more interesting from here on out.