Summary

Web Analytics Tutorial

 

Appendix A – Making Reports More Usable

IN THIS APPENDIX
* Matching Expectations
   Web Server Configuration
   Adjusting Definitions to Improve Metrics
* Matching Business Organization
* Interfacing to Content Delivery Systems

Interfacing to Content Delivery Systems

Some web sites are implemented with content delivery systems that route every request through a single script or dynamic page. This can leave your reports with very little detail. As a simple example, assume a site has a dynamic page, /navigate.asp, that formats all the content before delivering it to the user. A typical stream of requests would then look like this:

  /navigate.asp?section=home&page=index.xml
  /images/logo.gif
  /images/banner.gif
  /images/home.gif
  /navigate.asp?section=products&page=index.xml
  /images/products.gif
  /navigate.asp?section=products&page=AS001.xml
  /images/products/AS001.gif

Figure 6. Sample Page Requests report. By default, the report shows only the wrapper page.
Under Summary’s default configuration, the Page Requests report would look like Figure 6: all requests are to the single page /navigate.asp. This does not tell you anything about how the site was used. You can easily change this by enabling the “Include query string in requests” setting on the Options configuration page. Summary will then list each of the page requests above as a separate entry in the report, showing which section and page each one produced.
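To make the effect of the query string setting concrete, here is a minimal Python sketch that counts the page requests from the example stream both ways. It is not how Summary works internally. The function names and the rule that anything under /images/ is not a page are assumptions made only for this illustration.

```python
from collections import Counter
from urllib.parse import urlsplit

# The request stream from the example above.
REQUESTS = [
    "/navigate.asp?section=home&page=index.xml",
    "/images/logo.gif",
    "/images/banner.gif",
    "/images/home.gif",
    "/navigate.asp?section=products&page=index.xml",
    "/images/products.gif",
    "/navigate.asp?section=products&page=AS001.xml",
    "/images/products/AS001.gif",
]


def is_page(request):
    # Assumption for this sketch: anything under /images/ is not a page.
    return not urlsplit(request).path.startswith("/images/")


def page_counts(requests, include_query):
    """Count page requests, keyed with or without the query string."""
    counts = Counter()
    for request in requests:
        if not is_page(request):
            continue
        key = request if include_query else urlsplit(request).path
        counts[key] += 1
    return counts


without_query = page_counts(REQUESTS, include_query=False)
with_query = page_counts(REQUESTS, include_query=True)
```

Without the query string, all three page requests collapse into one entry for /navigate.asp. With it, each section and page combination gets its own entry.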

Figure 7. Sample Directory report. The report is not very useful when all pages are processed by a wrapper.
Even with query strings enabled, the Directory report would still look like Figure 7, with every page listed in the root directory. That is accurate as far as the raw requests go, but people reading the report would expect the last two pages requested to be in the /products/ directory. You can use Summary’s Request Aliases to rewrite these requests into something closer to what report readers expect. For this example, you would set up the alias below to convert dynamically generated requests like /navigate.asp?section=products&page=index.xml into something that resembles a directory structure, /products/index.xml (see the Summary manual for details on setting up Aliases):

 /navigate.asp?section=*&page=* /$1/$2 
------------------------------------------------------
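The following Python sketch imitates what a wildcard alias of this shape does to a request. It only illustrates the idea. The helper names alias_to_regex and apply_alias are hypothetical, and Summary's own matching rules may differ, so consult the Summary manual for the real behavior.

```python
import re


def alias_to_regex(pattern):
    """Turn a wildcard pattern into a regex; each * becomes a capture group."""
    literal_parts = [re.escape(part) for part in pattern.split("*")]
    return re.compile("^" + "(.*?)".join(literal_parts) + "$")


def apply_alias(request, pattern, replacement):
    """Rewrite request if it matches pattern, filling in $1, $2, ..."""
    match = alias_to_regex(pattern).match(request)
    if match is None:
        return request  # Requests that do not match are left alone.
    return re.sub(
        r"\$(\d+)",
        lambda ref: match.group(int(ref.group(1))),
        replacement,
    )


PATTERN = "/navigate.asp?section=*&page=*"
REPLACEMENT = "/$1/$2"

rewritten = apply_alias(
    "/navigate.asp?section=products&page=index.xml", PATTERN, REPLACEMENT
)
untouched = apply_alias("/images/logo.gif", PATTERN, REPLACEMENT)
```

Here rewritten is /products/index.xml, while the image request passes through unchanged because it does not match the pattern.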

Not all content delivery systems are so simple. In fact, some of the more powerful ones, such as IBM WebSphere, Vignette Content Suite, and BEA WebLogic, use complex request structures that allow them to deliver personalized content or offer alternate editions (e.g. different languages, or wireless vs. HTML). Summary’s Aliases support regular expressions, which are a very powerful tool for matching text. Using regular expressions, you may be able to construct aliases that handle even these complex URLs.

Sometimes content delivery systems add more data than you want to analyze. For example, some systems may redirect the user three or four times before settling on the location of a file (especially when they provide load balancing). These redirections do not provide much useful information when you are analyzing your visitors’ traffic patterns. You can use filters, covered in Lesson 8 - Examining Subsets of Traffic, to remove this kind of traffic from your reports.
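As an illustration of what such a filter accomplishes, the Python sketch below drops redirect responses from a list of log lines. It assumes the log is in Common Log Format and that redirects can be recognized by their HTTP status code. Both are assumptions for this example, and Summary's filters are configured within Summary itself rather than written as code.

```python
import re

# Assumed layout: host ident user [date] "request" status bytes
STATUS = re.compile(r'^\S+ \S+ \S+ \[[^\]]*\] "[^"]*" (\d{3}) ')

# HTTP status codes that send the browser to another location.
# 304 (Not Modified) is deliberately not included: it is not a redirect.
REDIRECT_CODES = {"301", "302", "303", "307", "308"}


def drop_redirects(lines):
    """Yield every log line except those recording a redirect."""
    for line in lines:
        match = STATUS.match(line)
        if match and match.group(1) in REDIRECT_CODES:
            continue
        yield line  # Lines that cannot be parsed are kept as they are.


# Made-up sample lines, for illustration only.
SAMPLE = [
    '10.0.0.1 - - [16/Apr/2002:10:00:00 -0500] "GET /old HTTP/1.0" 302 0',
    '10.0.0.1 - - [16/Apr/2002:10:00:01 -0500] '
    '"GET /navigate.asp?section=home&page=index.xml HTTP/1.0" 200 5120',
    '10.0.0.1 - - [16/Apr/2002:10:00:02 -0500] "GET /images/logo.gif HTTP/1.0" 304 -',
]

kept = list(drop_redirects(SAMPLE))
```

Only the 302 line is removed. The 200 and 304 lines remain.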

The content delivery system may also affect parts of the log file line other than the request, such as the status code, host, or user. Sometimes the request stream is more complex than what the user sees. In these cases you will need to use another tool to pre-process the logs before Summary reads them. Some common tools for text manipulation are sed, awk, and perl. These are all standard on most Unix systems. Perl is also available for most other platforms. You can find more information on them and on regular expressions at the links below:
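The same kind of pre-processing can be written in any scripting language. As a hedged sketch, here it is in Python rather than sed, awk, or perl. It rewrites only the request field of each log line and leaves the other fields alone. The log layout (a quoted request of the form "METHOD URL PROTOCOL") and the names rewrite_url, preprocess_line, and preprocess_stream are assumptions for this example.

```python
import re

# First quoted field made of exactly three tokens: "METHOD URL PROTOCOL".
# Assumption: in the log format at hand, this is the request field.
REQUEST_FIELD = re.compile(r'"(\S+) (\S+) (\S+)"')

# The wrapper page from the example in this appendix.
WRAPPER = re.compile(r"^/navigate\.asp\?section=([^&]*)&page=([^&]*)$")


def rewrite_url(url):
    """Map a wrapper request onto a directory-style URL."""
    return WRAPPER.sub(r"/\1/\2", url)


def preprocess_line(line):
    """Rewrite the URL inside the request field; keep everything else."""
    def fix(match):
        method, url, protocol = match.groups()
        return '"%s %s %s"' % (method, rewrite_url(url), protocol)

    return REQUEST_FIELD.sub(fix, line, count=1)


def preprocess_stream(source, destination):
    """Copy source to destination one line at a time, rewriting as we go."""
    for line in source:
        destination.write(preprocess_line(line))
```

To use it as a command-line filter, call preprocess_stream(sys.stdin, sys.stdout) after importing sys, and point Summary at the rewritten log file.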

Perl regular expressions
The reference page for the regular expression language that Summary supports.
O’Reilly Perl.com
The most common source for information on the Perl scripting language. Includes links to downloads for many platforms.
O’Reilly sed & awk, 2nd Edition
A popular guide to the sed and awk text manipulation languages.



Copyright 2002 by Summary.Net - Updated 16.Apr.2002