

Nadine Wolff
Published on: 18.03.2014
WDF*IDF – Old wine in new bottles?
The WDF*IDF formula describes the significance of a term within a document in relation to a set of other documents. But what's new about this formula? Quite simply: more precise term weighting!
The abbreviation WDF stands for 'within-document frequency' and measures how often a term appears in a text. Like the older keyword density, this factor sets the term in relation to all other words of the text. Unlike keyword density, the WDF factor applies a logarithm to the counts, which dampens the effect of simply repeating a term. So far, everything is well known and old hat.
The IDF is the new and innovative aspect of this mathematical magic formula. IDF stands for 'inverse document frequency' and takes into account all relevant documents in the database. The more documents contain the term, the smaller the IDF factor. The document is thus always set in relation to the indexed documents with the same keyword. Beyond the formula itself, factors like information architecture, the number of external links, visitor numbers, and authorship as a trustworthy source also play a role in ranking, along with keyword density. The more relevant pages there are online, the harder it is for the document to rank in the SERPs. The novelty in these supposedly old matters lies exactly in this IDF factor. But how new is this approach really?
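For reference, one commonly cited way of writing the two factors is sketched below. The symbols are not defined in this post and are introduced here as assumptions: Freq(i,j) is the number of occurrences of term i in document j, L_j is the total number of words in document j, N_D is the number of documents in the database, and f_i is the number of those documents that contain term i. The logarithm bases are also an assumption, and some variants use log(1 + N_D / f_i) for the IDF instead.

```latex
\[
\mathrm{WDF}(i,j) = \frac{\log_2\bigl(\mathrm{Freq}(i,j) + 1\bigr)}{\log_2(L_j)},
\qquad
\mathrm{IDF}(i) = \log_{10}\!\left(\frac{N_D}{f_i}\right)
\]
\[
\text{term weight}(i,j) = \mathrm{WDF}(i,j) \cdot \mathrm{IDF}(i)
\]
```

Read this way, the WDF grows only slowly as a term is repeated, while the IDF shrinks as more documents contain the term.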
The Mathematical Formula for Content – The Theory of Measuring the Web World
The WDF*IDF formula tries to establish a mathematical relationship between the keywords of a text and the existing documents on the web. What the equation says: the ranking of a text does not depend solely on the quality of the text (WDF) but also on the existing search results for the keyword (IDF) in the SERPs.
Imagine writing a text about 'tadpoles'. This text contains high-quality content with added value for the user and includes relevant secondary keywords and good information architecture alongside the main keywords. As a result, this tadpole text would rank better than an equally high-quality text about, for example, the 'Brigitte diet'. Why is that?
One reason for the better ranking is the different competitive situation in the SERPs. Google returns significantly fewer indexed documents for 'tadpoles' (148,000) than for 'Brigitte diet' (1,080,000).
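The effect of the two result counts can be illustrated with a small calculation. This is a minimal sketch, not how any search engine actually scores pages: it assumes the commonly cited definitions WDF = log2(count + 1) / log2(length) and IDF = log10(N / f), treats the Google result counts as the number of documents containing the term, and uses hypothetical values for the corpus size (TOTAL_DOCS), the term count, and the text length.

```python
import math


def wdf(term_count: int, doc_length: int) -> float:
    """Within-document frequency: log2(term_count + 1) / log2(doc_length)."""
    if doc_length < 2:
        raise ValueError("doc_length must be at least 2 words")
    return math.log2(term_count + 1) / math.log2(doc_length)


def idf(total_docs: int, docs_with_term: int) -> float:
    """Inverse document frequency: log10(total_docs / docs_with_term)."""
    if docs_with_term <= 0 or docs_with_term > total_docs:
        raise ValueError("docs_with_term must be between 1 and total_docs")
    return math.log10(total_docs / docs_with_term)


# Result counts quoted in the article.
TADPOLES_RESULTS = 148_000
BRIGITTE_DIET_RESULTS = 1_080_000

# Hypothetical values, chosen only for illustration.
TOTAL_DOCS = 1_000_000_000   # assumed size of the index
TERM_COUNT = 15              # assumed occurrences of the keyword in each text
DOC_LENGTH = 1_000           # assumed length of each text in words

# Both texts are assumed to be equally well written, so they share one WDF.
shared_wdf = wdf(TERM_COUNT, DOC_LENGTH)

idf_tadpoles = idf(TOTAL_DOCS, TADPOLES_RESULTS)
idf_brigitte = idf(TOTAL_DOCS, BRIGITTE_DIET_RESULTS)

score_tadpoles = shared_wdf * idf_tadpoles
score_brigitte = shared_wdf * idf_brigitte

print(f"WDF (both texts):       {shared_wdf:.3f}")
print(f"IDF 'tadpoles':         {idf_tadpoles:.3f}")
print(f"IDF 'Brigitte diet':    {idf_brigitte:.3f}")
print(f"WDF*IDF 'tadpoles':     {score_tadpoles:.3f}")
print(f"WDF*IDF 'Brigitte diet': {score_brigitte:.3f}")
```

Under these assumptions the gap between the two IDF values equals log10(1,080,000 / 148,000), roughly 0.86, whatever value is chosen for TOTAL_DOCS. The rarer keyword therefore gets the higher weight even though both texts have the same WDF.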
The Keyword Cannibalism behind the WDF*IDF Formula
But can this WDF*IDF formula really be the ultimate wisdom? As already established, it is old wine in new bottles. SEO experts are divided on this.
While today there are tools like wdfidf-tool.com, as early as 2002 'keyword density' was already being considered a 'relatively useless metric', according to Rand Fishkin, Moz founder and one of the best-known international SEO experts. Why is this formula for keyword density 'useless'? Because it chases after an artificial formula without keeping the obvious, namely the user, at the center of focus. And, as is often the case with higher mathematics, there is also an unknown 'X' here: the relations between documents can change constantly, so the text has to be continuously modified and re-optimized according to the formula.
What is truly new, however, is that Google is smarter today than it was back then. The Google algorithm Hummingbird has been around since the fall of 2013. It no longer just reads the text of a page with the respective keywords and relevant secondary keywords, but also analyzes how relevant the search result is to the search query. Hummingbird no longer merely considers the individual words of a search query but analyzes the query as a whole sentence and filters the search results accordingly. This affects about 90% of organic search queries.
The Subtle Difference – User Experience!
All in all, this magic formula has once again raised the bar for a good website towards quality content. But only those who can produce quality content with the user experience in mind will be at the forefront in the SEO year 2014. It is no longer about exact keywords but rather about interesting topics and relevant facts; even words that are not keywords are important here. Thus, the motto 'Back to the roots' applies to ranking in the SERPs: the first focus is the user's 'desire' to click on the respective search result. The content then has to turn this interest into a positive user experience by offering added value, unique content, and a unique page.
If you want to have your content or website reviewed from a marketing strategy perspective, we are the right partner for you. Contact us!

Nadine Wolff
As a long-time expert in SEO (and web analytics), Nadine Wolff has been working with internetwarriors since 2015. She leads the SEO & Web Analytics team and is passionate about all the (sometimes quirky) innovations from Google and the other major search engines. In the SEO field, Nadine has published articles in Website Boosting and looks forward to professional workshops and sustainable organic exchanges.