Juliusz Chroboczek, unfortunately, in his copious free time.
If the fine manual and this FAQ didn't solve your problem, and you have checked the list of known bugs in Polipo, feel free to ask on the Polipo-users mailing list.
As this list carries fairly moderate traffic (usually between 0 and 3 messages a day, with occasional bursts of activity), you should feel free to subscribe. You are welcome to send mail to the list even if you're not subscribed, in which case you should mention that you want to be CC'd with replies.
You may browse the list without subscribing from SourceForge by HTTP (slow and unreliable), from Gmane by HTTP (faster) and from Gmane by NNTP (even faster, but then, I'm using a smart newsreader and a poor web browser).
You should not contact the developers personally, unless you have good reasons to want your query to remain confidential. If you do, please make sure to include the word ‘‘polipo’’ somewhere in the subject line.
You can either send mail to the Polipo-users mailing list (no subscription required) or submit an issue to the issue tracker. If you do both, please mention the issue number in your mail.
Polipo is developed on recent versions of Linux (with glibc) and tested with both gcc and clang.
Polipo is fairly portable C89 code, and should work on any POSIX-ish system built in the last 30 years. Older versions of Polipo have been successfully tested on at least the following systems:
If you're lucky, you might find a Windows binary in my download area.
Polipo has been reported to be slowish under Cygwin; please use the native Windows port instead. Under SVR4 and native Windows, Polipo has a small memory leak due to a deficiency in the putenv library function. Please use a Free Unix system if possible.
Polipo should be easy to port to any 32- or 64-bit machine with a Unix-like system, a half-decent C89 or C99 compiler, and either the poll or select system call. The system calls writev and readv are nice to have, but not strictly necessary.
Yes, Polipo for Windows usually requires some manual configuration. Please see this document for more information.
You might also find useful hints in this thread.
Any compliant HTTP/1.0 or HTTP/1.1 client that can speak to an HTTP proxy should be fine. This includes every browser known to me.
Polipo aims at being a compliant HTTP/1.1 proxy. It should work with any web site that complies with either HTTP/1.1 or the older HTTP/1.0. (Polipo does not support the long-obsolete HTTP/0.9.)
Since a large number of web sites do not comply with relevant standards, Polipo includes a number of workarounds for broken sites, and can optionally be configured to include even more of those at a slight cost in performance.
A well-tuned web site will send replies that contain hints for a cache to fine-tune their behaviour. Please see this caching tutorial if you want to make your web site cooperate with Polipo and other standards-compliant caches.
Polipo is Free and Open Source Software, and comes with absolutely no strings attached, not even the rather mild conditions required by typical GNU-style software or the even milder conditions required by the older (four-pronged) BSD license. Please read the Polipo distribution conditions and see for yourself.
Polipo is already running.
If the local webserver (http://localhost:8123/) works fine, but polipo just hangs when accessing remote hosts, there might be a DNS problem. Try running
$ polipo dnsUseGethostbyname=true
If that works around the problem, read DNS in the Polipo manual for more information about what's going on.
Usually, Polipo will say something like
DNS: recv failed: Connection refused (111)
Falling back to using system resolver.
This means that Polipo couldn't contact a name server. Please point Polipo at a working recursive name server by editing /etc/resolv.conf, or, if that is impossible, by setting the variable dnsNameServer. Please see DNS in the Polipo manual for more information.
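Before editing anything, it can help to see which name servers the system is currently configured with. The following is a minimal sketch, assuming a conventional /etc/resolv.conf layout; the helper name list_nameservers is made up for this example and is not part of Polipo.

```shell
# list_nameservers [FILE]: print the address on each "nameserver" line.
# FILE defaults to /etc/resolv.conf; the helper name is hypothetical.
list_nameservers() {
    awk '$1 == "nameserver" { print $2 }' "${1:-/etc/resolv.conf}"
}

# Only query the system file if it exists and is readable.
if [ -r /etc/resolv.conf ]; then
    list_nameservers
fi
```

If the list is empty, or the addresses shown do not answer DNS queries, that would be consistent with the ‘‘recv failed’’ message above.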
Why does the S key stop working after installing Polipo for Windows?
Some users report that after installing Polipo for Windows, their S key stops working until they reboot the computer.
We've been unable to duplicate the issue. It appears to be related to
the version of the Nullsoft Installer and some combination of
installed software in the system. If you have this issue, please
install the Polipo binary manually.
Either the server is buggy, and it speaks HTTP incorrectly, or polipo is buggy, and it cannot parse perfectly good HTTP. Polipo will ignore the incorrect header, which should work around the issue.
Was that an FTP URL? Polipo is an HTTP proxy that can tunnel HTTPS; it's not an FTP or FTP-over-HTTP proxy.
Reconfigure your browser so that it doesn't use a proxy for FTP connections.
That's normal. A resource was decorated with metadata that causes compliant HTTP/1.1 proxies (including Polipo) to request it every time. Polipo is logging the URL in case you want to include it in your forbidden URLs file.
That's normal. Polipo decided to pipeline requests to a given server, and it later turned out that this was a bad decision. Polipo is recovering.
That's okay. The server is violating RFC 2616, Section 10.3.5. Polipo will work around the problem.
That's okay. Polipo tried to find out whether an object had been superseded (to revalidate it using an if-modified-since request) and the server replied with the full object rather than just informing Polipo there was no change. If the server keeps doing that, Polipo will switch to using a different, slightly slower validation method (HEAD revalidation).
That's sort of okay. The server is sending a combination of headers that doesn't make sense but that is not explicitly forbidden by RFC 2616. Polipo will make a reasonable guess about the meaning of the server's reply (it will assume that the reply is not persistent).
The most common cause for such issues is a site that provides incorrect cache control information, and hence causes Polipo to serve stale data to the client.
You can work around most such issues by setting dontCacheRedirects and dontCacheCookies to true, and creating a file ~/.polipo-uncachable (or whatever you set uncachableFile to) with the following contents:
\.(php[345]?|[sp]html|cgi|pl|py|[aj]sp)$
\?
/cgi-bin/
Note that doing this will slow down Polipo quite a bit.
(Thanks to hondza for providing this answer.)
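To get a feel for which URLs these three patterns catch, they can be tried out with grep. This is only a sketch: it assumes that grep -E is a fair approximation of how Polipo applies the patterns to a URL, and both the scratch file and the is_uncachable helper are made up for the example.

```shell
# Scratch copy of the pattern file; Polipo itself reads ~/.polipo-uncachable
# (or whatever uncachableFile points to).
patterns=$(mktemp)
cat > "$patterns" <<'EOF'
\.(php[345]?|[sp]html|cgi|pl|py|[aj]sp)$
\?
/cgi-bin/
EOF

# is_uncachable URL: succeeds if any of the patterns matches the URL.
# Assumption: grep -E approximates Polipo's own matching.
is_uncachable() {
    printf '%s\n' "$1" | grep -E -q -f "$patterns"
}

for url in \
    'http://example.org/index.php' \
    'http://example.org/search?q=polipo' \
    'http://example.org/logo.png'
do
    if is_uncachable "$url"; then
        echo "uncachable: $url"
    else
        echo "cachable:   $url"
    fi
done
```

The first pattern catches common script extensions at the end of the URL, the second any URL with a query string, and the third anything under a cgi-bin directory.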
Yes.
No.
Yes. SOCKS4a and SOCKS5 with hostnames are supported. Please check the variables socksParentProxy and socksProxyType.
Polipo is transparent if you set the following in your config file:
maxAge = 0
maxExpiresAge = 0
But that's probably not what you meant — please see the next question.
No.
Interception proxying (sometimes confusingly called ‘‘transparent’’ proxying) is a technique that intercepts client connections at the network layer in order to redirect them to an application-layer proxy.
Interception proxying is a fundamentally broken design (see for example this posting and RFC 3143, Section 2.2.2), and will not be supported by Polipo. If you want to use interception proxying in order to avoid manually configuring your clients, please ask your browser vendor to provide a proper protocol for client auto-configuration. If you want to use interception proxying for any other reason, you're probably doing something wrong.
(Or you're a fascist pig with a read-only mind.)
There's no such thing as an anonymising proxy (but see below about tor). Some proxies, however, have some features that make client identification somewhat less precise.
Every server that you access can find out the IP address of the machine where the connection comes from. When you use a proxy, this is the IP address of the machine running the proxy; thus, a shared proxy makes it slightly more difficult to find out which client is accessing a given server.
Polipo does not use the non-standard X-Forwarded-For header that gives the client's address out. By default, Polipo does not include the Via header that tells servers the name of every proxy being used (but check the disableVia variable).
There are, however, many other elements of the HTTP/HTML suite that give up information about you. The most obvious are HTTP headers, notably cookies, Accept-Language and User-Agent; all of these can be censored by Polipo.
The Javascript client side scripting language can also be used to disclose information about the client; the only solution is to disable Javascript in your browser.
All of the tweaks suggested above will break some sites. And remember: no matter what measures you take, you will not be anonymous; always assume that your local law enforcement agency, your boss, your significant other and your mother know which sites you have been visiting.
Yes. In order to get the privacy enhancements of Privoxy and much (but not all) of the performance of Polipo, you should put Polipo upstream of Privoxy.
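In other words, you should configure your browser to use Privoxy as its proxy (localhost:8118), and configure Privoxy to use Polipo as its parent proxy (forward / localhost:8123 in the Privoxy config file).
(Tor is a volunteer-run network of anonymising proxies.)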
Yes. Set socksParentProxy to localhost:9050.
See also Running tor with Polipo.
It depends on what your threat model is and how you configure Polipo.
By default, Polipo allows anyone on the allowedClients network to connect and access the configuration interface and the list of cached pages. You can close that loophole by setting disableLocalInterface.
By default, access to Polipo is only allowed from the local machine. If you change allowedClients to allow remote access, Polipo relies on your routers to prevent address spoofing. Setting authCredentials does not improve security if you don't control your routers, as HTTP Basic authentication is vulnerable to sniffing. I myself leave allowedClients at the default value and use ssh tunnels when sharing proxies.
A serious security bug was found in Polipo 0.9.8. This bug could only be exploited by people who were allowed to access the proxy. This has been fixed in 0.9.9.
During its history, four buffer overflows have been found (and fixed) in Polipo. One was only present on obsolete systems (with the old definition of snprintf), the other three were buffer overflows while reading, and therefore most probably impossible to exploit.
Any filesystem that is able to efficiently handle large numbers of small files should do.
Under Linux, reiserfs/tails provides the best space/time compromise. Other good choices are ext4, reiserfs/notails and XFS. Avoid ext2 and ext3 without hashed directories. Make sure that the filesystem is mounted with the relatime option, or, even better, noatime (but make sure you know what this option implies).
Under BSD Unix, FFS (UFS) with small frags has good space usage, but access time of large directories is pretty bad. FreeBSD's hashed directories only partially solve the problem.
The currently fashionable copy-on-write filesystems (ZFS, btrfs, f2fs) should provide excellent performance, but I haven't tested them myself.
No idea about Windows.
This should in principle work with NFSv3 except if your NFS implementation is buggy (but see below about Linux). If you run the proxy (polipo) and the expiry process (polipo -x) on different hosts, you might (if your LAN is very noisy, your NFS implementation very primitive, and you're very unlucky) get I/O errors and broken connections, but no data corruption should happen.
NFSv2 is definitely not safe, unless both the proxy and the expiry process are run on the same host.
NFSv3 is not safe if the NFS client is running a Linux version earlier than 2.6.5.
More generally, any filesystem should work as long as:
open(O_CREAT|O_EXCL)
works reliably, and
If you use a filesystem that does not maintain last-modified time correctly, you might want to set the variable preciseExpiry to true. This will make expiry much slower.
Just copy the contents of /var/cache/polipo. You don't need to preserve atimes, but you should preserve mtimes. You should send Polipo a USR2 signal both before and after you perform the copy.
You can also merge two caches by simply copying one over the other.
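As a rough illustration of the copy step, the helper below copies one cache directory into another while keeping modification times. It is a sketch under stated assumptions: copy_cache is a made-up name, and it relies on cp -Rp preserving mtimes, which is the usual behaviour on Unix systems.

```shell
# copy_cache SRC DST: copy the contents of SRC into DST, keeping mtimes.
# Hypothetical helper, not part of Polipo. DST is created if missing;
# if it already holds a cache, the two are merged.
# cp -R recurses; -p preserves modification times (and modes/ownership
# where permitted).
copy_cache() {
    mkdir -p "$2" && cp -Rp "$1"/. "$2"/
}
```

A call such as copy_cache /var/cache/polipo followed by the destination directory of your choice would be wrapped between two killall -USR2 polipo commands, as described above.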
That's not what Polipo has been designed for, but it turns out to work reasonably well.
The cache will not become corrupted (except if you're running NFSv2). If two polipi try to access the same object at the same time, they will complain loudly and fail to save the new data on disk.
Polipo doesn't like your system clock to change by more than a few seconds. If you ever need to step your system clock, you may want to stop Polipo before the change and restart it afterwards.
If you step the system clock backwards by a large amount, polipo will become confused about the dates of the files in the disk cache and start serving stale data to clients. You can work around this problem either by manually purging the on-disk cache (rm -r), or by shift-clicking reload in your browser. (Stepping the clock forwards does not have this problem.)
All of the algorithms that polipo uses are safe with respect to clock skew between proxy and server. In other words, if your system clock is wildly off (but you don't step it), polipo will react by fetching more data than necessary from the network, never by serving stale data.
You can avoid the inconveniences described above by using an NTP client to keep the system time accurate. I'm running ntpd on my servers and desktop machines, and chrony on my laptops.
The default configuration of Polipo is carefully tuned to balance size and speed. The simplest way to make Polipo faster is to give it more memory; see Memory usage in the manual for information on doing that.
If you're using a lot of regular expressions in your /etc/forbidden file, don't do that. Using domains is okay, although even that is not as fast as I'd like.
If you're on a fast network, you may also improve Polipo's I/O performance by recompiling it with a larger CHUNK_SIZE.
$ make EXTRA_DEFINES="-DCHUNK_SIZE=8192"
The default value is 4096 on 32-bit architectures, and 8192 on 64-bit ones. 8192 and 16384 are good values (there's hardly any benefit beyond that). Note that doing that decreases Polipo's ability to allocate memory in a flexible manner, and you should increase Polipo's chunk memory to compensate.
If your Polipo is limited by the speed of the disk, you may be able to make it feel faster by playing with the value of idleTime. See Asynchronous Writing in the manual for more information.
If you are limited by the speed of the network, you may get Polipo's cache to be more effective by serving stale data; if you do that, you will sometimes need to hit reload to see the fresh contents of a page. See the description of the variables cacheIsShared, relaxTransparency and mindlesslyCacheVary in the manual.
The simplest way to decrease Polipo's memory usage is to give it less memory; please see Memory usage in the manual for information on doing that.
If you decrease Polipo's chunk memory, you may want to recompile it with a smaller CHUNK_SIZE.
$ make EXTRA_DEFINES="-DCHUNK_SIZE=2048"
2048 is a good value. (While Polipo will work with 1024 and even 512 byte chunks, I don't recommend going below 2048.)
If you're using a lot of regular expressions in your /etc/forbidden file, don't do that.
Polipo is fairly small out of the box. If you're building a single-floppy system using Polipo, or burning Polipo into a router's ROM, you might want to go to some extra effort in order to make Polipo's binary smaller.
Of course, you will want to run strip(1) on the Polipo binary.
Many of Polipo's features can be compiled out if not needed; please see the Makefile for details.
You may also want to recompile Polipo with assertions disabled by defining the macro NDEBUG.
For example, if you're using gcc, you might want to say
$ make CDEBUGFLAGS="-Os -Wall" EXTRA_DEFINES="-DNDEBUG -DNO_IPv6 -DNO_STANDARD_RESOLVER -DNO_REDIRECTOR -DNO_FORBIDDEN" all
You might also want to compress the resulting binary using something like upx, but only do that if you understand its effect on virtual memory (swap) usage.
Most of the answers in this section only apply to Unix systems. Windows-specific contributions are welcome.
$ du /var/cache/polipo/ | sort -n | tail
You need to send polipo a request for the object with a Cache-Control: no-cache header.
With Netscape/Mozilla: go to the page and hit shift-reload.
With other tools: make sure that http_proxy is set and use one of the following:
$ curl -I -H 'Cache-Control: no-cache' http://... > /dev/null
$ wget --header='Cache-Control: no-cache' -O /dev/null http://...
$ squidclient -s -p 8123 -r http://...
$ killall -USR1 polipo
$ rm -r /var/cache/polipo/www.microsoft.com/
$ killall -USR2 polipo
There is currently no automated way of performing this operation. You can do it by hand, by identifying and removing the relevant file in the cache:
$ killall -USR1 polipo
$ grep -il '^X-Polipo-Location: http://www.pps.jussieu.fr/~jch/software/polipo/^M' /var/cache/polipo/www.pps.jussieu.fr/*
/var/cache/polipo/www.pps.jussieu.fr/pA2oquORVPZEXdYJ7cXWOQ==
$ rm /var/cache/polipo/www.pps.jussieu.fr/pA2oquORVPZEXdYJ7cXWOQ==
$ killall -USR2 polipo
Type ^V^M in order to get a ^M onto the command line.
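If typing a literal ^M is awkward, the carriage return can be generated with printf instead. The helper below is a sketch of that idea; find_cached is a made-up name, and it assumes (as the grep above does) that each cache file carries an X-Polipo-Location line ending in a carriage return.

```shell
# find_cached URL [CACHE_DIR]: list the cache files whose X-Polipo-Location
# header is exactly URL. Hypothetical helper; CACHE_DIR defaults to
# /var/cache/polipo.
find_cached() {
    cr=$(printf '\r')
    # -F treats the URL as a fixed string rather than a regular expression;
    # -x requires the whole line to match, trailing carriage return included.
    grep -r -l -i -x -F -e "X-Polipo-Location: $1$cr" "${2:-/var/cache/polipo}"
}
```

Because the URL is matched as a fixed string, characters such as ‘‘?’’ or ‘‘.’’ in it need no escaping. As before, send Polipo USR1 before looking and USR2 after removing any file.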
If you've got an IPv6-only network, the most convenient solution to get access to the IPv4 Internet would be to set up routing through a NAT box, either natively (that's what we do on our mesh network) or else using a set of IPv4-in-IPv6 tunnels (or an IPv4-over-IPv6 VPN).
A more pedestrian solution is to get your TCP client applications to tunnel through an instance of Polipo running on a double-stack host.
Install Polipo on a double-stack host. Set proxyAddress to ::, then configure both allowedClients and your firewall suitably. Do not use authCredentials: it is insecure, and should not be used except in very particular situations.
You should then make sure that tunnelAllowedPorts includes at least 22, 443, 873 and 5223.
On every client, configure your web clients, your Jabber clients, your rsync clients, etc. to use Polipo as an HTTP proxy. For command line clients, this can be done with:
$ export http_proxy=http://polipo.example.org:8123
$ export https_proxy=http://polipo.example.org:8123
$ export RSYNC_PROXY=polipo.example.org:8123
On systems using OpenSSH, you will want to install socat and create a script ssh-polipo with the following contents:
#!/bin/sh
exec ssh -o 'ProxyCommand socat - PROXY:polipo.example.org:%h:%p,proxyport=8123' "$@"
If you are running an IPv6-only network, I definitely want to hear from you.
When someone publishes a fix on the Polipo-users list, it usually comes in the form of a patch, a plain text file with the extension ‘.patch’ or sometimes ‘.diff’.
A patch describes the differences between a released version of polipo and the fixed version. Modifying the released version in order to get the fixed version is called applying the patch, which is done with a program called patch.
You first need to untar the polipo sources, change to the directory
where the sources are, and then invoke patch:
$ cd polipo-0.9.4/
$ patch -p1 < ../polipo-fix.patch
If patch complains that it cannot find the file to patch, try using -p0 instead of -p1. If patch complains that it cannot recognise the patch format, you're probably trying to use a SVR4 version of the patch utility with a patch in unified diff format; please install GNU patch first (or upgrade to a Free Unix system).
$ git clone git://git.torproject.org/git/polipo
$ cd polipo/
$ gitk &
Alternatively, check the GitHub mirror of Polipo.
Valgrind is a (rather amazing) memory debugger for Linux. If you send me a bug report that I cannot reproduce, I may ask you to try to reproduce it under valgrind.
In order to do that, you will need to install valgrind on your system. You should then recompile Polipo with debugging:
$ make clean
$ make CDEBUGFLAGS='-g -Wall'
You should then run Polipo under valgrind:
$ valgrind ./polipo
and send me any error messages produced.
What does the ‘‘servers?’’ display mean?
In order to make sound decisions about pipelining and PMM, Polipo needs to cache a certain amount of data about the servers it accesses; the contents of the server cache can be displayed on http://localhost:8123/polipo/servers?.
This information is mostly useful for people developing Polipo;
however, if you're interested, read on.
The ‘‘servers?’’ display looks like so:
Server | Version | Persistent | Pipeline | Connections | | rtt | rate |
---|---|---|---|---|---|---|---|
www.pps.jussieu.fr | 1.1 | yes | unknown | 1/2 | | 0.008 | |
www.kde.org | 1.1 | yes | yes | 2/2 | | 0.135 | 181176 |
slashdot.org | 1.1 | no | | 0/4 | | 0.217 | 33681 |
ez.no | 1.1 | no | | 0/2 | (1 lies) | 0.268 | |
‘‘Server’’ is the name of the server. ‘‘Version’’ can be ‘‘1.0’’, ‘‘1.1’’ or ‘‘unknown’’, and is the version of HTTP that the server claims to speak.
‘‘Persistent’’ specifies whether the server does persistent connections, and can be one of ‘‘yes’’, ‘‘no’’ or ‘‘unknown’’. If ‘‘Version’’ is ‘‘1.1’’ and ‘‘Persistent’’ is ‘‘yes’’, then ‘‘Pipeline’’ specifies whether the server can reliably do pipelining; it can be one of ‘‘yes’’, ‘‘no’’, ‘‘unknown’’ or, if a pipelining probe is currently in progress, ‘‘probing’’.
‘‘Connections’’ specifies the number of connections to this server. It is of the form ‘‘m/n’’, where m is the number of connections currently open, and n the maximum number of connections that polipo will use when speaking to this server.
If the server failed to respond to Polipo's standard validation method (‘‘If-Modified-Since’’ and ‘‘If-None-Match’’ preconditions), the server is marked as a ‘‘liar’’, and Polipo switches to using the ‘‘HEAD’’ method for validation. This is noted as ‘‘(n lies)’’, where n is (a linearly decaying measure of) the number of times the server lied to a precondition.
Finally, ‘‘rtt’’ is an estimate of the time the server takes to respond to requests in seconds (the server's round-trip time), and ‘‘rate’’ is an estimate of the transfer rate from the server, in bytes per second. Both are exponentially smoothed averages, and are absent if not measured yet.
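For readers unfamiliar with the term, an exponentially smoothed average blends each new measurement into the running estimate with a fixed weight. The sketch below illustrates the general technique only; the helper name smooth and the weight used are assumptions for the example, not values taken from Polipo.

```shell
# smooth ALPHA SAMPLE...: print the exponentially smoothed average of the
# samples. ALPHA is the weight given to each new sample; both the helper
# name and the choice of ALPHA are assumptions, not Polipo's own.
smooth() {
    alpha=$1
    shift
    # Nothing measured yet: print nothing, like an absent table cell.
    [ "$#" -gt 0 ] || return 0
    printf '%s\n' "$@" | awk -v a="$alpha" '
        NR == 1 { avg = $1; next }          # the first sample seeds the average
        { avg = a * $1 + (1 - a) * avg }    # later samples are blended in
        END { printf "%.3f\n", avg }'
}

# Three hypothetical round-trip times, in seconds.
smooth 0.5 0.200 0.100 0.400
```

With a weight close to 1 the estimate follows recent measurements closely; with a weight close to 0 it changes slowly and a single outlier has little effect.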
Polipo doesn't use the standard stub resolver (gethostbyname and getaddrinfo) but instead implements its own DNS resolver. There are two reasons for that. First, gethostbyname and getaddrinfo are blocking interfaces: using one of those would mean that the whole of Polipo would hang while a DNS lookup is in progress. Second, neither gethostbyname nor getaddrinfo returns the DNS TTL, and obeying the TTL is a MUST according to RFC 2616, Section 15.3.