Like a lot of folks in the SEO world, I’ve been looking at sites which were “hit” by Google’s Penguin update.
At this point, I have done a detailed analysis of the first 7 cases (involving a total of 9 sites) on my list. In six of the seven, I have found what I would call “obvious” on-site issues. Things like:
- Hidden text. Not necessarily “manipulative,” but still hidden text. Three of the six positive cases have hidden text that was easy for me to find. In no case was it clearly “manipulative,” but it was clearly hidden.
- Alt and title attribute abuse – keywords stuffed into the alt and title attributes of images, and into the title attribute of anchor tags (links).
- “Holy cow” copy at the bottom of the home page and/or other pages, up to and including all pages. By “holy cow” I mean lengthy blocks of copy, stuffed with keywords and links, often in smaller type, gray text, etc.
- Hidden or disguised links – as in, the only way to know it’s a link is to mouse over the text.
- Blogs and “article” sections full of terrible content, stuffed with keyword links – where the links may or may not be disguised.
Since all of these things should be cleaned up regardless of whether the Penguin even cares about them, it’s been easy for me to advise the owners of those sites to clean that mess up. If you’re spamming, don’t wait for Google to tell you about it.
This leaves me (so far) with 3 very interesting cases…
- Two cases of “twin” sites – same owner, one site is “hit” by Penguin, and the other is not.
- One case where the site appears, on the surface, to be a “false positive” in terms of on site issues.
“Twin studies” are interesting in science and medicine, because studying twins (where you know the DNA is the same) helps distinguish between what is caused by genetics (nature) and what is caused by environmental factors (nurture).
In the case of the Penguin update, “twin sites” may tell us a lot about what’s going on, and what has changed in the environment. So let’s take these one at a time…
Case #1: The Hidden Text Case
Pretty simple – they have two sites in slightly different niches of the same larger market. The sites are linked together.
One of the sites has a significant amount of hidden text in the template; the other does not. The site with the hidden text lost significant traffic and rankings on April 24/25. The site without hidden text did not. The hidden text in this case is created by using CSS to drop an image on top of the text.
This hints that hidden text is a bad idea. As if we needed a hint.
Case #2: Gray-on-Gray vs. Black, White, and Blue all over.
Similar to case #1 – two sites, same market, different niches. The sites are not linked together but the domain registration info is identical as is the physical address on each site, so it’s no secret that they are owned by the same company.
One of the two sites has “holy cow” copy at the bottom of the home page, with the text in gray on a (lighter shade of) gray background. #666666 on #cccccc if you care. The links stuffed into the holy cow copy are the same color as the rest of the text and not underlined, so the only way a normal human visitor would know something was a link is by mousing over it. This site lost just under 40% of its referral traffic from Google at the Penguin update.
The second site has very similar “holy cow” copy, but the text is in black (#000000) on a white background (#ffffff), and the links are unstyled – that is to say, the links are blue and underlined, and obviously links. This site is up 3% since Penguin, but a sampling of rankings indicates no change, so this is most likely a natural seasonal increase that both sites would probably have experienced if Penguin had not occurred.
This hints that disguising links is bad. As if we needed that hint.
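If you want a rough number for how hard the gray-on-gray copy in Case #2 is to read, one option is the WCAG contrast ratio from the accessibility world. To be clear, this is my own illustration: nothing here suggests Google uses this formula for Penguin or anything else, and the function names are made up for the sketch. The only inputs taken from the case are the four hex colors.

```python
def relative_luminance(hex_color: str) -> float:
    """WCAG 2.x relative luminance of an sRGB color such as '#666666'."""
    hex_color = hex_color.lstrip("#")
    # Split into R, G, B and scale each channel to the 0..1 range.
    channels = [int(hex_color[i:i + 2], 16) / 255 for i in (0, 2, 4)]
    # Undo the sRGB gamma curve, as the WCAG definition specifies.
    linear = [
        c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4
        for c in channels
    ]
    r, g, b = linear
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(foreground: str, background: str) -> float:
    """WCAG contrast ratio, from 1 (identical colors) up to 21."""
    lighter, darker = sorted(
        (relative_luminance(foreground), relative_luminance(background)),
        reverse=True,
    )
    return (lighter + 0.05) / (darker + 0.05)


# Colors from Case #2: the site that was hit vs. the site that was not.
gray_on_gray = contrast_ratio("#666666", "#cccccc")
black_on_white = contrast_ratio("#000000", "#ffffff")

print(f"gray on gray:   {gray_on_gray:.2f}:1")
print(f"black on white: {black_on_white:.2f}:1")
```

The gray-on-gray pair comes out at roughly 3.6:1, which is below the 4.5:1 that WCAG AA asks for with normal-size body text, while black on white is the maximum possible 21:1. That does not prove anything about Penguin, but it puts a number on “hard for a human to see.”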
Case #3: The prolific link spammer…
I mention this one because I simply haven’t been able to find any big issues with the site itself. It is possible that they are cloaking, and not telling me about it, because people sometimes do stupid things and lie to their doctor about what they have done. This site’s Google organic referrals are down by about 26% in the post-Penguin week vs. the week prior*. It’s very likely that I’ll find something “on site” once I dig a little deeper.
This site does have a pretty aggressive inbound linking profile – that is to say, a ridiculously implausible number of links coming in with anchor text precisely matching the queries where they were most affected. Whether this means that they have been “penalized” for over-optimizing their inbound links, or simply that part of Penguin is giving such links less weight in ranking, we can’t say. The latter seems more likely.
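To put a number on “ridiculously implausible,” you could measure what share of a site's inbound links use anchor text that exactly matches the affected queries. The sketch below is only an illustration: the anchor texts and queries are invented, the function name is mine, and it assumes you already have an export of anchor text from whatever link tool you use. It also says nothing about where a “safe” share might be, because we don't know that.

```python
from collections import Counter


def exact_match_share(anchors, target_queries):
    """Fraction of inbound links whose anchor text exactly matches a query.

    Matching is case-insensitive and ignores surrounding whitespace.
    """
    normalized = [anchor.strip().lower() for anchor in anchors]
    if not normalized:
        return 0.0
    targets = {query.strip().lower() for query in target_queries}
    counts = Counter(normalized)
    exact = sum(n for text, n in counts.items() if text in targets)
    return exact / len(normalized)


# Hypothetical export: one anchor text per inbound link.
anchors = [
    "blue widgets",
    "Blue Widgets",
    "example.com",
    "click here",
    "blue widgets",
    "cheap blue widgets",
    "Example Widgets Inc",
    "blue widgets",
]
# Hypothetical queries where the site lost rankings.
affected_queries = ["blue widgets", "cheap blue widgets"]

share = exact_match_share(anchors, affected_queries)
print(f"{share:.1%} of inbound links use exact-match anchor text")
```

In this made-up sample, 5 of the 8 links match exactly. Comparing that share across affected and unaffected sites would be one way to start testing the inbound link hypothesis rather than just repeating it.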
The truth is that we don’t even know if inbound links have anything to do with the Penguin thing. I know what you are reading out there. I read it too. Most of what’s being written is simply one person parroting what another person said.
The best bit of evidence about inbound links is inconclusive but highly informative, once you understand who the data was collected from, and how. Those who spam aggressively off site are likely spamming aggressively on site – correlation is not causation – tails do not wag dogs.
That doesn’t mean the data from Micro Site Masters is useless – far from it, it’s a strong hint that we should look at more cases like case #3 – and it gives us better ideas on hypotheses to test.
Sorry, no “miracle cure” today…
Anyway – sorry I don’t have a “7 steps to fix your problem” ready just yet. I don’t think it would be particularly responsible to post something like that, when we are all still trying to work out what’s happening.
However, given the number of positive cases where there was pretty obvious spam on the site, those who are affected might do well to consider whether their sites really represent “best practices,” or “what we thought we could get away with.”
We’ll have more for you soon – thanks for reading!
— Dan Thies
* At any change event, you need 7 days’ worth of data to have sufficient confidence about which search terms have been affected. We often see little bumps and bounces in rankings and search volume throughout the week, and it’s far more productive to analyze SERPs that we know have changed.
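The week-over-week comparison described in this footnote can be sketched in a few lines. The daily visit counts below are invented, chosen only so the result lands near the drop mentioned in Case #3, and the function name is hypothetical. The one rule taken from the footnote is that each window must hold a full 7 days, so weekday and weekend swings cancel out.

```python
def weekly_percent_change(before, after):
    """Percent change in total referrals between two full 7-day windows."""
    if len(before) != 7 or len(after) != 7:
        raise ValueError("each window must contain exactly 7 daily values")
    total_before = sum(before)
    total_after = sum(after)
    if total_before == 0:
        raise ValueError("cannot compute a change from zero traffic")
    return (total_after - total_before) / total_before * 100


# Hypothetical daily Google organic referrals, Monday through Sunday.
week_before = [1200, 1250, 1230, 1210, 1100, 700, 710]
week_after = [900, 920, 910, 890, 820, 510, 526]

change = weekly_percent_change(week_before, week_after)
print(f"week-over-week change: {change:.1f}%")
```

Comparing a single day against the day before would mostly measure the day of the week, which is why the function refuses anything shorter than a full week on either side.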
UPDATE: Just to be very clear – although I’m not convinced that inbound link spam is a factor in Penguin, it’s almost certainly a good idea to clean it up. If you’re going to clean up a bunch of link spam, document what you do – the links you were able to remove, the links you couldn’t remove and the reason why, etc. If you end up submitting a reconsideration request, it will go a lot better, and a lot faster, if you can provide a spreadsheet detailing what you’ve done.
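If you would rather keep that record as you go instead of rebuilding it later, a tiny script that writes a CSV (which any spreadsheet program will open) is enough. This is only a sketch: the column names, the example URLs, and the helper names are my own assumptions, not a format that Google asks for.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields


@dataclass
class LinkCleanupRecord:
    """One row of the cleanup log (hypothetical column layout)."""
    linking_url: str
    anchor_text: str
    action_taken: str
    removed: bool
    reason_if_not_removed: str = ""


def write_cleanup_log(records, handle):
    """Write the records as CSV to an open text file or file-like object."""
    writer = csv.DictWriter(
        handle, fieldnames=[f.name for f in fields(LinkCleanupRecord)]
    )
    writer.writeheader()
    for record in records:
        writer.writerow(asdict(record))


# Invented entries showing one removed link and one that could not be removed.
records = [
    LinkCleanupRecord(
        linking_url="http://spammy-directory.example/page1",
        anchor_text="blue widgets",
        action_taken="Emailed site owner",
        removed=True,
    ),
    LinkCleanupRecord(
        linking_url="http://abandoned-blog.example/post",
        anchor_text="cheap blue widgets",
        action_taken="Emailed twice, no reply",
        removed=False,
        reason_if_not_removed="No working contact details on the site",
    ),
]

buffer = io.StringIO()
write_cleanup_log(records, buffer)
print(buffer.getvalue())
```

To save a real file, pass a handle from `open("link_cleanup_log.csv", "w", newline="", encoding="utf-8")` instead of the in-memory buffer.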