Drive has 700+ articles for digital transformation leaders written by StarCIO Digital Trailblazer, Isaac Sacolick. Learn more.

For those of you who want a free, easy clickstream analysis tool, have a look at StatViz. If you’re running Apache and using the standard log format, then plugging in this tool is very easy.

Setup

  1. Download and install GraphViz. There’s an RPM for Linux…
  2. Download and install StatViz in a directory. It’s basically one PHP file. The README file will tell you how to customize the configuration file and run it.
  3. I don’t have too many PHP apps running, so there are a couple of other things you may need to do. First, you’ll need the PEAR Config package. Once you have it, uncompress/untar it; the easiest thing to do is move Config.php and the Config dir to /usr/share/pear. Second, StatViz takes up a lot of memory, so you may need to increase the memory_limit configuration parameter in your /etc/php.ini.

That’s pretty much it…

Basic Running

You can run it using

./statviz.php --config configfile

and then create a gif file of the output by doing something like

dot -Tgif -oOutputGifFileName InputDotFile

If you put the output gif file in a web-accessible directory, you’ll be able to see it from your browser.
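If you would rather script the rendering step, here is a minimal Python sketch that wraps the dot command shown above. The function names and the idea of naming the gif after the dot file are my own assumptions for illustration; only the dot options (-Tgif, -o) come from GraphViz itself.

```python
import subprocess
from pathlib import Path


def build_dot_command(dot_file: str, gif_file: str) -> list[str]:
    """Return the GraphViz command that renders a DOT file as a gif."""
    return ["dot", "-Tgif", "-o", gif_file, dot_file]


def render_gif(dot_file: str, web_dir: str) -> Path:
    """Render dot_file into web_dir and return the path of the new gif.

    The gif is named after the DOT file (an assumption of this sketch).
    """
    gif_path = Path(web_dir) / (Path(dot_file).stem + ".gif")
    # check=True raises CalledProcessError if dot exits with an error.
    subprocess.run(build_dot_command(dot_file, str(gif_path)), check=True)
    return gif_path
```

For example, render_gif("clicks.dot", "/var/www/html") would write /var/www/html/clicks.gif, assuming dot is on your PATH and that directory (a hypothetical path) is served by your web server.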

Things To Look For

There are a number of things you’ll need to consider if you want accurate results:

  • Make sure you look at the bot extensions and make best attempts to get these filtered out.
  • Make sure you have all non-pages (graphics, js, css) filtered out.
  • If possible, try to filter out requests from internal users. StatViz doesn’t have a filter for this, so I just scrubbed them out of the logs myself using grep -v.
  • If your site has long URLs, you will most certainly want to clean them up before processing. The tool allows you to create an alias file, but you may need/want to do some log scrubbing on your own.
  • Play around with the GraphNReferrerPairs parameter. You can get a lot more detail on site activity with higher numbers, but the graph then becomes a lot more complex to digest. If you decide on a large graph, you may need to modify the source and change the size of the graph. It defaults to 10, 8 and there isn’t a parameter to configure this. I changed it to 20, 16 for most of my small graphs (GraphNReferrerPairs <>) and to 40, 32 for larger graphs.
  • Very long URLs are going to be a hassle, especially if they come from external referrers and out of your control. I put in some checks in the code to clip the very long URLs.
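The log scrubbing described in this list can also be done as a pre-processing pass before StatViz ever sees the log. The sketch below is not how StatViz does it, and not the exact changes I made; it is a rough Python equivalent of the grep -v approach. The suffix list, bot markers, internal address prefixes, and clipping length are all assumptions you would adjust for your own site.

```python
import re

# Apache common/combined log format:
# host ident user [time] "METHOD url protocol" status size "referer" "agent"
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[[^\]]+\] '
    r'"(?P<method>\S+) (?P<url>\S+)[^"]*" \d{3} \S+'
    r'(?: "(?P<referer>[^"]*)" "(?P<agent>[^"]*)")?'
)

# All four values below are assumptions for illustration.
STATIC_SUFFIXES = (".gif", ".jpg", ".jpeg", ".png", ".ico", ".js", ".css")
BOT_MARKERS = ("bot", "crawler", "spider", "slurp")
INTERNAL_PREFIXES = ("10.", "192.168.")
MAX_URL_LENGTH = 60


def clip_url(url, limit=MAX_URL_LENGTH):
    """Cut a URL down to at most `limit` characters."""
    return url if len(url) <= limit else url[:limit]


def scrub_line(line):
    """Return the cleaned log line, or None if it should be dropped."""
    line = line.rstrip("\n")
    match = LOG_LINE.match(line)
    if match is None:
        return None  # not a parseable request line
    url = match["url"]
    path = url.split("?", 1)[0].lower()
    agent = (match["agent"] or "").lower()
    if match["host"].startswith(INTERNAL_PREFIXES):
        return None  # internal user
    if path.endswith(STATIC_SUFFIXES):
        return None  # graphics, js, css
    if any(marker in agent for marker in BOT_MARKERS):
        return None  # bot, judged by user agent
    # Rebuild the line with the (possibly clipped) request URL.
    start, end = match.span("url")
    return line[:start] + clip_url(url) + line[end:]
```

Running every line of the access log through scrub_line and writing out the non-None results gives a smaller log to feed to StatViz. Note that this only clips the requested URL; long referrer URLs would need the same treatment.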

Automating

I’ve automated a couple of things on my site:
– A report that updates hourly on today’s activity.
– I archive a daily gif file. (I will add weekly and monthly in the future).
– I have a ‘full report’ that shows activity for the last 30 days. I update this daily.
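To give a feel for what these jobs involve, here is a small Python sketch of two helpers: one that keeps only the log lines from the last N days (for the 30-day report) and one that names the daily archive file. The function names and the statviz-YYYY-MM-DD.gif naming scheme are hypothetical, and the timestamp parsing assumes the standard Apache format with English month abbreviations.

```python
import re
from datetime import date, datetime, timedelta, timezone

TIMESTAMP = re.compile(r"\[([^\]]+)\]")
APACHE_TIME_FORMAT = "%d/%b/%Y:%H:%M:%S %z"  # e.g. 10/Oct/2005:13:55:36 -0700


def entry_time(line):
    """Return the timezone-aware timestamp of a log line, or None."""
    match = TIMESTAMP.search(line)
    if match is None:
        return None
    try:
        return datetime.strptime(match.group(1), APACHE_TIME_FORMAT)
    except ValueError:
        return None


def lines_since(lines, days, now=None):
    """Keep only the log lines from the last `days` days."""
    if now is None:
        now = datetime.now(timezone.utc)
    cutoff = now - timedelta(days=days)
    kept = []
    for line in lines:
        when = entry_time(line)
        if when is not None and when >= cutoff:
            kept.append(line)
    return kept


def archive_name(day: date) -> str:
    """Name of the archived gif for one day (hypothetical scheme)."""
    return f"statviz-{day:%Y-%m-%d}.gif"
```

A daily job could write lines_since(log_lines, 30) to a temporary file, run StatViz and dot on it, and copy the day’s gif to archive_name(date.today()). The hourly report would be the same idea with a one-day window, and the scheduling itself would be left to cron.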

I’ll put out another entry with a quick 101 on interpreting the results.
