Browser configuration

You have to configure your web browser to use WebCleaner as a proxy.

Netscape/Mozilla

Select Edit -> Preferences -> Advanced -> Proxies.
Activate Manual proxy configuration.
Under HTTP Proxy enter localhost, the Port is 8080.
Under HTTPS Proxy enter localhost, the Port is 8080.
Under No Proxy for enter localhost, 127.0.0.1.
Click Ok to use your new settings.

Firefox

Select Edit -> Preferences -> General -> Connection Settings.
Activate Manual proxy configuration.
Under HTTP Proxy enter localhost, the Port is 8080.
Under SSL Proxy enter localhost, the Port is 8080.
Under No Proxy for enter localhost, 127.0.0.1.
Click Ok to use your new settings.

Internet Explorer

Select Tools -> Internet Options -> Connections.
Click on LAN Settings. If you have a dialup connection to the internet, select your dialup connection and click on Settings.
Activate Use a proxy server.
If activated, deactivate Bypass proxy server for local addresses.
Click on Advanced.
Under HTTP enter localhost, the Port is 8080.
Under Secure enter localhost, the Port is 8080.
Click Ok to use your new settings.

Opera 8

Select Tools -> Preferences -> Advanced -> Network -> Proxy servers.
Activate HTTP and enter localhost, the Port is 8080.
Activate HTTPS and enter localhost, the Port is 8080.
Activate Enable HTTP 1.1 for proxies.
Activate Do not use proxy on the addresses below and enter localhost, 127.0.0.1.
Click Ok to use your new settings.

Konqueror (KDE)

Select Settings -> Configure Konqueror -> Proxy.
Activate Manually specify the proxy settings and select its Setup. In the new window, enter localhost as the hostname and 8080 as the port number for both HTTP and HTTPS.
Under Exceptions add both localhost and 127.0.0.1 with the New button.
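The same proxy settings can also be used from a script, for example to check that the proxy answers before changing a browser. The following is a minimal sketch using Python's standard urllib, assuming WebCleaner listens on localhost port 8080 as in the browser instructions above. The function names are illustrative and not part of WebCleaner.

```python
import urllib.request


def proxy_settings(host="localhost", port=8080):
    """Return a urllib proxy map that mirrors the browser settings above."""
    address = "http://%s:%d" % (host, port)
    # Plain HTTP and HTTPS requests are both sent to the same proxy.
    return {"http": address, "https": address}


def build_proxy_opener(host="localhost", port=8080):
    """Build an opener that routes its requests through the proxy."""
    handler = urllib.request.ProxyHandler(proxy_settings(host, port))
    return urllib.request.build_opener(handler)


# Usage (needs a running proxy; the URL is only a placeholder):
#   opener = build_proxy_opener()
#   with opener.open("http://example.com/", timeout=10) as response:
#       print(response.status)
```

Building the opener does not contact the proxy; a connection is only made when a URL is opened.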

Proxy filter modules

WebCleaner uses a modular filter design that allows a lot of flexibility for different uses.
Each module has a list of MIME types and a list of the HTTP request/response stages it applies to. Each module can be further customized by separate rules in the filter configuration.

Name Description Requirements Configuration rules
BinaryCharFilter Replace illegal binary characters in HTML code like the quote chars often found in Microsoft pages.
MIME types: text/html
HTTP stages: response content body
None
Blocker Block or allow specific sites by URL name. Before a URL is matched, its hostname and path are unquoted to avoid spoofing attacks.
MIME types: all
HTTP stages: request URL
Block, Allow
Compress Compression of documents with good compression ratio like HTML, WAV, etc.
MIME types: text/*, application/postscript, application/pdf, application/x-dvi, audio/basic, audio/midi, audio/x-wav, image/x-portable-*map, x-world/x-vrml
HTTP stages: response content body
None
GifImage Deanimates GIFs and removes all unwanted GIF image extensions (for example GIF comments).
MIME types: image/gif
HTTP stages: response content body
None
Header Add, modify and delete HTTP headers of request and response.
MIME types: all
HTTP stages: request and response headers
Header
HtmlRewriter Parse HTML code and rewrite single tags, attributes and values. Execute and filter JavaScript. Parse and filter content rated pages. Filter HTML comments.
MIME types: text/html
HTTP stages: response content body
Javascript, Nocomments, Rating, Htmlrewrite
ImageReducer Convert images to low quality JPEG files to reduce bandwidth.
Software: the Python Imaging Library (PIL) must be installed.
MIME types: all image types supported by the Python Imaging Library (as of version 1.1.5: jpeg, png, gif, bmp, pcx, tiff, xbm, xpm)
HTTP stages: response content body
None
ImageSize Remove images with certain width and/or height.
Software: the Python Imaging Library (PIL) must be installed.
MIME types: all image types supported by the Python Imaging Library (as of version 1.1.5: jpeg, png, gif, bmp, pcx, tiff, xbm, xpm)
HTTP stages: response content body
Image
Rating Parse and evaluate content rating data.
MIME types: all
HTTP stages: response headers
Rating
Replacer Replace regular expressions in data streams.
MIME types: text/html, (text|application)/javascript
HTTP stages: response content body
Replace
VirusFilter Scan all data with the ClamAV virus scanner. For performance reasons, scanned objects are limited to a maximum size of 4 MB. If an object exceeds that size, the proxy returns an error.
Software: the ClamAV virus scanner must be installed on the proxy host.
MIME types: text/html
HTTP stages: response content body
Antivirus
XmlRewriter Parse XML code and rewrite single tags, attributes and values. It can also filter embedded HTML content, which often occurs in RSS feeds.
MIME types: text/html
HTTP stages: response content body
Htmlrewrite, Xmlrewrite
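The Blocker entry above notes that the hostname and path are unquoted before matching. The sketch below illustrates why that matters: a percent-escaped URL would otherwise slip past a pattern. It is an illustration under assumptions, not WebCleaner's actual code; the function names and the example pattern are made up.

```python
import re
from urllib.parse import unquote, urlsplit


def normalized_target(url):
    """Return the unquoted 'host/path' so escapes cannot hide a match."""
    parts = urlsplit(url)
    host = unquote(parts.hostname or "").lower()
    path = unquote(parts.path)
    return host + path


def is_blocked(url, patterns):
    """Return True if any block pattern matches the unquoted host and path."""
    target = normalized_target(url)
    return any(re.search(pattern, target) for pattern in patterns)


# "%61" is the escaped letter "a", so the raw URL does not contain "/ads/".
spoofed = "http://example.com/%61ds/banner.gif"
raw_match = re.search(r"/ads/", spoofed) is not None
unquoted_match = is_blocked(spoofed, [r"/ads/"])
```

Here raw_match is False while unquoted_match is True, which is the spoofing case the unquoting step guards against.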

Filter configuration rules

Htmlrewrite

Matching

An HTML rewrite rule applies to one specified HTML tag and can replace parts of the tag or the complete tag (or delete them if the replacement data is empty). The tag name is a case-insensitive string.
If attributes are given, they must also match before the rule applies.

Action

If no replacement is given, the specified tag part is removed; otherwise it is replaced.
Back references to matched subgroups can be specified in the replacement string with a backslash and the subgroup number (e.g. \1, \2, ...).

What each replace part does when the replacement is foo:

replace part  before                 after
tag           <blink>text</blink>    footextfoo
tagname       <blink>text</blink>    <foo>text</foo>
enclosed      <blink>text</blink>    <blink>foo</blink>
attr          <a href="bla">..</a>   <a foo>..</a>
attrval       <a href="bla">..</a>   <a href="foo">..</a>
complete      <a href="bla">..</a>   foo

If you specify zero or more than one attribute to match, 'attr' and 'attrval' replace the first occurring or matching attribute, or nothing at all.
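The back references described above behave like those of Python's re module. The following sketch shows the idea for an 'attrval' replacement. It assumes that the whole attribute value is replaced by the expanded replacement string, which may differ from what WebCleaner does internally; rewrite_attrval and the example URL are hypothetical.

```python
import re


def rewrite_attrval(value, pattern, replacement):
    """Replace an attribute value if the pattern matches; else keep it."""
    match = re.search(pattern, value)
    if match is None:
        return value
    # Match.expand() resolves \1, \2, ... against the matched subgroups.
    return match.expand(replacement)


# Strip a redirector and keep only the real link target (subgroup 1).
href = "http://tracker.example/redirect?to=http://example.org/"
cleaned = rewrite_attrval(href, r"redirect\?to=(.+)$", r"\1")
```

With these inputs, cleaned is "http://example.org/". A value that does not match the pattern is returned unchanged.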

Xmlrewrite

Selector

An XML rewrite rule applies to one specific XML tag and can replace (or delete) parts of the tag or the complete tag. The selector is a simplified XPath expression of the form (/tag)+ where a tag is of the form name([attr=val(,attr=val)*])?. Tag names, attributes and values are case-sensitive.
Example: /rss/channel/item/description selects the <description> XML tag in an RSS news feed.
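To make the selector grammar concrete, here is a small parser sketch for the form (/tag)+ with tags of the form name([attr=val(,attr=val)*])?. It is not WebCleaner's parser. It additionally assumes that attribute values contain no '/', ',' or ']' characters, and parse_selector is a hypothetical name.

```python
import re

# One selector step: a name, optionally followed by [attr=val,attr=val].
TAG_RE = re.compile(r"^(?P<name>[^/\[\]=,]+)(?:\[(?P<attrs>[^\[\]]*)\])?$")


def parse_selector(selector):
    """Split a selector into a list of (tag name, attribute dict) steps."""
    if not selector.startswith("/"):
        raise ValueError("selector must start with '/': %r" % selector)
    steps = []
    for part in selector[1:].split("/"):
        match = TAG_RE.match(part)
        if match is None:
            raise ValueError("invalid tag in selector: %r" % part)
        attrs = {}
        if match.group("attrs") is not None:
            for pair in match.group("attrs").split(","):
                name, sep, value = pair.partition("=")
                if not sep or not name:
                    raise ValueError("invalid attribute: %r" % pair)
                attrs[name] = value
        # Names and values are kept as written, since they are case-sensitive.
        steps.append((match.group("name"), attrs))
    return steps


steps = parse_selector("/rss/channel/item/description")
```

For the example selector from the text, steps holds four entries, one per tag, each with an empty attribute dictionary.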

Action

  Defined replacement types
replace type replace value action
rsshtml unused Assumes all text content inside the XML tag is HTML. Only allows certain HTML tags, and filters the HTML data with the Htmlrewrite rules.
remove unused Removes the complete selected XML tag and its content.

Replace

Replace regular expressions in HTML or JavaScript pages.

Block

A block rule specifies regular expressions for URLs which must be blocked.
The replacement URL specifies the URL to show when the block rule matches. If none is given, a default block message is shown.
Back references to matched subgroups can be specified in the replacement URL with a backslash and the subgroup number (e.g. \1, \2, ...).
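A block rule with a replacement URL can be pictured as follows. This is a simplified sketch rather than WebCleaner's implementation; block_target, the example URLs, and the DEFAULT_BLOCK_URL placeholder standing in for the default block message are all assumptions.

```python
import re

# Hypothetical stand-in for the default block message.
DEFAULT_BLOCK_URL = "about:blank"


def block_target(url, pattern, replacement_url=None):
    """Return the URL to show instead of `url`, or None if it is not blocked."""
    match = re.search(pattern, url)
    if match is None:
        return None
    if not replacement_url:
        return DEFAULT_BLOCK_URL
    # Back references such as \1 are filled in from the matched subgroups.
    return match.expand(replacement_url)


pattern = r"^http://ads\.example\.com/banner/(\d+)\.gif$"
target = block_target(
    "http://ads.example.com/banner/42.gif",
    pattern,
    r"http://localhost/blocked?id=\1",
)
```

Here target is "http://localhost/blocked?id=42", because subgroup 1 matched the digits 42.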

Blockdomains

Block a list of domains. The domain list is stored in an external compressed file.

Blockurls

Block a list of URLs. The URL list is stored in an external compressed file.

Allow

An allow rule specifies regular expressions for URLs which must be allowed, even if a matching block rule exists.
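The precedence of allow rules over block rules can be sketched like this. The function name and the example patterns are hypothetical, and the real Blocker module may evaluate its rules differently.

```python
import re


def is_request_allowed(url, block_patterns, allow_patterns):
    """Return True if the URL may be fetched; allow rules win over block rules."""
    if any(re.search(pattern, url) for pattern in allow_patterns):
        return True
    return not any(re.search(pattern, url) for pattern in block_patterns)


block = [r"/ads/"]
allow = [r"^http://trusted\.example\.com/"]

blocked_ad = is_request_allowed("http://example.com/ads/1.gif", block, allow)
trusted_ad = is_request_allowed("http://trusted.example.com/ads/1.gif", block, allow)
```

In this example blocked_ad is False, while trusted_ad is True even though the block pattern also matches that URL.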

Allowdomains

Allow a list of domains. The domain list is stored in an external compressed file.

Allowurls

Allow a list of URLs. The URL list is stored in an external compressed file.

Header

Modify HTTP headers. If the replacement value is empty, the header is deleted; otherwise it is replaced, or added if it did not exist before.
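The delete, replace and add behaviour of a header rule can be sketched on a plain dictionary of headers. This is only an illustration; apply_header_rule is a hypothetical helper, and the case-insensitive name comparison reflects HTTP header semantics rather than a documented WebCleaner detail.

```python
def apply_header_rule(headers, name, value):
    """Return a copy of `headers` with one header rule applied."""
    # Drop any existing header with the same name (names are case-insensitive).
    result = {k: v for k, v in headers.items() if k.lower() != name.lower()}
    # An empty value means "delete"; anything else replaces or adds the header.
    if value:
        result[name] = value
    return result


request_headers = {"User-Agent": "ExampleBrowser/1.0", "Referer": "http://example.com/"}

without_referer = apply_header_rule(request_headers, "Referer", "")
new_agent = apply_header_rule(request_headers, "user-agent", "Anonymous")
with_dnt = apply_header_rule(request_headers, "DNT", "1")
```

The three calls show deletion (empty value), replacement of an existing header, and addition of a header that was not present.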

Image

Block images with a certain size by replacing them with a transparent 1x1 image.

Javascript

Execute and filter JavaScript (JS) in HTML pages using the integrated SpiderMonkey JS engine. The filter deletes popups and places dynamic content emitted with document.write() into the HTML file.

Nocomments

Remove comments from HTML source. Comments inside <script> or <style> tags are not removed.
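The effect of this rule can be approximated with Python's standard html.parser module, which reports the content of <script> and <style> elements as plain data rather than as comments. The sketch below is not WebCleaner's HtmlRewriter. CommentStripper and strip_comments are hypothetical names, and end tags are written back in lowercase.

```python
from html.parser import HTMLParser


class CommentStripper(HTMLParser):
    """Re-emit HTML input while dropping comments outside script/style."""

    def __init__(self):
        # Keep entity and character references exactly as written.
        super().__init__(convert_charrefs=False)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        self.parts.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        self.parts.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        self.parts.append("</%s>" % tag)

    def handle_data(self, data):
        # Script and style content arrives here, comments included.
        self.parts.append(data)

    def handle_entityref(self, name):
        self.parts.append("&%s;" % name)

    def handle_charref(self, name):
        self.parts.append("&#%s;" % name)

    def handle_decl(self, decl):
        self.parts.append("<!%s>" % decl)

    def handle_pi(self, data):
        self.parts.append("<?%s>" % data)

    def handle_comment(self, data):
        # Comments in ordinary markup are dropped.
        pass


def strip_comments(html):
    """Return `html` without comments, except those inside script/style."""
    parser = CommentStripper()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)


cleaned = strip_comments("<p>a<!-- x -->b</p><script><!-- keep --></script>")
```

The comment between "a" and "b" is removed, while the one inside the <script> element is kept.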

Rating

A single activated Rating rule is enough to enable the content rating system in WebCleaner. Several distinct content rating services, including the one defined by WebCleaner itself, can be configured.

Antivirus

A single activated Antivirus rule is enough to enable virus filtering in the VirusFilter module.
