The Polipo Manual: Censor Accept-Language

3.3.2.1 Why censor Accept-Language

Recent versions of HTTP include a mechanism known as content negotiation which allows a user-agent and a server to negotiate the best representation (instance) for a given resource. For example, a server that provides both PNG and GIF versions of an image will serve the PNG version to user-agents that support PNG, and the GIF version to Internet Explorer.

Content negotiation requires that a client should send with every single request a number of headers specifying the user’s cultural and technical preferences. Most of these headers do not expose sensitive information (who cares whether your browser supports PNG?). The ‘Accept-Language’ header, however, is meant to convey the user’s linguistic preferences. In some cases, this information is sufficient to pinpoint with great precision the user’s origins and even his political or religious opinions; think, for example, of the implications of sending ‘Accept-Language: yi’ or ‘ar_PS’.

At any rate, ‘Accept-Language’ is not useful. Its design is based on the assumption that language is merely another representation for the same information, and ‘Accept-Language’ simply carries a prioritised list of languages, which is not enough to usefully describe a literate user’s preferences. A typical French user, for example, will prefer an English-language original to a French (mis-)translation, while still wanting to see French language texts when they are original. Such a situation cannot be described by the simple-minded ‘Accept-Language’ header.