
Varnish: Normalizing / Normalising the URL

Last updated on July 9, 2015

We’ve had a small issue with our installation of the Varnish proxy cache: it was not caching as efficiently as we had hoped. We tracked this down to our use of Google AdWords and Google Analytics for tracking, which add query string parameters such as utm_source, utm_medium, utm_campaign and gclid to the URL. Varnish treated each variant as a separate URL (and/or did not cache the page at all), which led to a poor cache hit rate.

I’ve added the following code to fix this, which may be of use to others:

/* Normalize the URL - first remove any fragment (it shouldn't reach the server anyway, but just in case) */
if (req.url ~ "\#") {
    set req.url = regsub(req.url, "\#.*$", "");
}
/* Normalize the URL - remove Google tracking parameters */
if (req.url ~ "\?") {
    /* Strip tracking parameters that follow another parameter */
    set req.url = regsuball(req.url, "&(utm_source|utm_medium|utm_campaign|gclid)=([A-Za-z0-9_\-]+)", "");
    /* Strip a tracking parameter that comes first, keeping the "?" */
    set req.url = regsuball(req.url, "\?(utm_source|utm_medium|utm_campaign|gclid)=([A-Za-z0-9_\-]+)", "?");
    /* Tidy up a leftover "?&" or a trailing "?" */
    set req.url = regsub(req.url, "\?&", "?");
    set req.url = regsub(req.url, "\?$", "");
}
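One limitation of the character class used above is that it stops at the first character that is not a letter, digit, "_" or "-", so a value such as spring%20sale or abc.123 would only be partly removed. Below is a minimal sketch of a variation, not a definitive fix. It assumes the code lives in vcl_recv (the post does not say where it goes) and that a tracking value runs up to the next "&". The example URLs in the comments are made up for illustration.

```vcl
# Fragment only: merge this into your existing vcl_recv.
sub vcl_recv {
    if (req.url ~ "\?") {
        # "/page?a=1&utm_campaign=spring%20sale&b=2" -> "/page?a=1&b=2"
        set req.url = regsuball(req.url, "&(utm_source|utm_medium|utm_campaign|gclid)=[^&]*", "");
        # "/page?gclid=abc.123&a=1" -> "/page?&a=1"
        set req.url = regsuball(req.url, "\?(utm_source|utm_medium|utm_campaign|gclid)=[^&]*", "?");
        # "/page?&a=1" -> "/page?a=1"
        set req.url = regsub(req.url, "\?&", "?");
        # "/page?" -> "/page"
        set req.url = regsub(req.url, "\?$", "");
    }
}
```

Matching with [^&]* trades strictness for coverage: anything after the "=" up to the next parameter is dropped, whatever characters it contains.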

One Comment

  1. I was looking for this exact thing and stumbled across your blog. Thank you.

    I’ve shortened your VCL code a bit. Here is what I used:

    # Strip out Google Analytics campaign variables. They are only needed
    # by the javascript running on the page
    # utm_source, utm_medium, utm_campaign, gclid
    if (req.url ~ "(\?|&)(gclid|utm_[a-z]+)=") {
        set req.url = regsuball(req.url, "(gclid|utm_[a-z]+)=[-_A-Za-z0-9]+&?", "");
        set req.url = regsub(req.url, "(\?|&)$", "");
    }
