Jan 30
2009

The other day I was doing my daily reading and I came across the following paragraph: “Analytics people who like to cull patterns from massive amounts of data like to aggregate rather than split data. In web analytics this means treating several pages as one unit in order to know about visits that saw one or more of a certain set of pages that the analyst thinks belong together. In WebTrends and other software this is done with “content grouping” and Google has no parallel to it.” Chris Grant, Got Analytics? blog

“Google has no parallel to it!!” I have to admit that I took this statement personally as I consider Google Analytics my baby. :) So I went to my colleague, Rehan Asif, to discuss this and in less than twenty minutes we came up with the following concept:

  • Categorize pages into groups of related content.
  • Collect these pages together on one page and treat them as a single entity.
  • Specify the URLs that you want to include in each group by defining URL patterns.
  • Create a filter for each group.  Each filter will search for the group identifier and replace the entire URL with a new URL.

Here is a real example from an online shoe store, where we want to take all pages that focus on a specific brand (for example, Converse, Timberland, Vans, or Reebok) and treat each brand's pages as one content group.

1) First, we studied the URLs and found that they contain the brand name.

http://www.domain.com/authentic-vans-shoes-satain-blackpink.html
http://www.domain.com/adidas-bg-superstar-whtblk.html
http://www.domain.com/puma-big-kids-drift-cat-jr-blkwht.html

2) Using an advanced filter, all pages with “vans” in their URL will be renamed to “/vans.html”.

3) Now create a filter for each brand and apply the filters to a new profile called “Content Groups”.

4) Now we have created content groups that allow us to look at all pages for any brand as a single entity. We can now study the links where people exit, the entrance keywords, the entrance sources, other pages they visit on the site, and more.
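The rewrite described in the steps above can be tried outside Google Analytics to see which URLs a keyword would catch. The snippet below is only a rough sketch of the match-and-overwrite logic, not Google Analytics code; the function name `groupByBrand`, the brand list, and the `/about-us.html` URL are assumptions made for the example.

```javascript
// Sketch of the rewrite each advanced filter performs: if the request URI
// contains a brand keyword, replace the whole URI with "/<brand>.html".
// groupByBrand and BRANDS are hypothetical names used only for this example.
const BRANDS = ["vans", "adidas", "puma"];

function groupByBrand(requestUri, brands = BRANDS) {
  for (const brand of brands) {
    // Case-insensitive substring match (an assumption of this sketch).
    if (requestUri.toLowerCase().includes(brand)) {
      return `/${brand}.html`; // overwrite the entire URI
    }
  }
  return requestUri; // no brand found: leave the URI untouched
}

const samples = [
  "/authentic-vans-shoes-satain-blackpink.html",
  "/adidas-bg-superstar-whtblk.html",
  "/puma-big-kids-drift-cat-jr-blkwht.html",
  "/about-us.html", // hypothetical non-brand page
];
for (const uri of samples) {
  console.log(uri, "->", groupByBrand(uri));
}
```

In this sketch the first brand in the list wins, so a URL that happens to contain two brand names lands in whichever group is listed first.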

Now, as I like to say, the real analysis begins! :)


37 Responses to “Content Grouping in Google Analytics”

  1. Allaedin,

    I enjoyed this post very much. Thanks for taking the time to write it.

    I believe that content grouping can really help make a complex site much easier to understand from a macro perspective. In this case you have grouped like-minded product pages; another alternative could be to group functional pages.

    Something like: Home page and all special landing pages. All product comparison and overview pages. All deep product detail pages. Cart pages. Checkout pages.

    In that simplistic example you could quickly see where problems or opportunities are. So much easier than looking at individual page level data. Then when you find problems, dig deeper, only where you have to.

    I wish it were a lot easier to do this in GA than it is. But hey, that’s why we have GAACs. : )

    Great job on this post.

    -Avinash.

  2. Marco Cilia says:

    very good way to do it. We really miss webtrends-like Content grouping!

    But this practice has an issue, I think:
    it doesn’t suit very large websites, due to character limits in the regex filter fields. Or, better put, it works well with URL rewrites enabled. But what if a page URL contains both “vans” and “adidas”?

  3. Hi Avinash,

    Thanks for your kind words… I really love the idea of grouping pages based on functionality. I will definitely try it and share it with the world soon :)

    Allaedin

  4. Rehan Asif says:

    Hi Marco,

    Make sure URLs don’t have both vans and adidas in them. :)

    In this case, I don’t think it is very likely to have a shoe with two brands but I see your point.
    Grouping by the right keywords is key and will require cooperation from all involved parties.

  5. Tim Leighton-Boyce says:

    This can work very well on e-commerce sites where the different levels of page can be grouped together.

    For example,

    all category.asp?category=whatever pages are grouped as aCategoryPage
    all product.asp?product=somesku pages are grouped as aProductPage

    It then becomes very easy to get all manner of useful stuff like the % of traffic entering at different depths of the site, relevant bounce rates etc

    Making the rewritten page group name stand out as ‘odd’, as in the example above, is useful, so that it is clear that there is some manipulation involved.

    Tim
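The page-type grouping described in this comment can be sketched as a small rule table. This is a minimal illustration under stated assumptions, not Google Analytics filter syntax: the rule list, the function name `pageTypeOf`, and the sample query values are all made up for the example.

```javascript
// Hypothetical rules: each pattern maps a family of dynamic URLs to one
// deliberately odd-looking group name, as suggested in the comment.
const PAGE_TYPE_RULES = [
  { pattern: /^\/category\.asp\?category=/, group: "aCategoryPage" },
  { pattern: /^\/product\.asp\?product=/, group: "aProductPage" },
];

// Return the group name of the first matching rule, or the URI unchanged.
function pageTypeOf(requestUri) {
  const rule = PAGE_TYPE_RULES.find((r) => r.pattern.test(requestUri));
  return rule ? rule.group : requestUri;
}

console.log(pageTypeOf("/category.asp?category=boots"));
console.log(pageTypeOf("/product.asp?product=sku123"));
console.log(pageTypeOf("/checkout.asp"));
```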

  6. Chris Grant says:

    Excellent! I am very happy to be wrong because the content grouping concept should be available to everyone.
    — CG

  7. Hi Allaedin, Rehan,

    nice work :-)

    Next step is to coax GA into giving us dynamic brand name filtering, but I guess we’d be better off rewriting _trackPageview in the page code.

    Best,

    Julien

  8. Marco/Tim/Julien: Thanks for the feedback

    Chris: Yes, now it is available to everyone and FREE :)

  9. Ophir Cohen says:

    nicely done :-)
    There are a few holes there where URLs may be misleading etc., but the general concept is fantastic. We can always do this for one profile and leave the others as they were, so as not to hurt data integrity.
    Ophir

  10. Tim Leighton-Boyce says:

    @Julien: I’m intrigued by your suggestion about ‘dynamic brand filtering’. What did you have in mind, please?

    At the moment I’ve been using regex-based filters to rewrite all the brand search phrases as ‘Brand’ in some profiles, and/or create profiles which only show brand/generic traffic. And these days we have custom segments as well.

    But this is such an important matter on many sites that I think it would be great to extend the idea and come up with something more powerful.

  11. @Tim I’m hoping for the lookup table functionality to be restored in filters so that you can emulate the filters set forth in this post instead of multiplying filters (one per brand keyword)

  12. Justo says:

    Hi Allaedin / Rehan:

    Thanks to both of you for this pretty useful tip. I want to suggest doing the same with the advanced segments feature (for me, an easier way to do it).

    The advantages of using advanced segments instead of filters are that we don’t need to duplicate profiles for the filters, and that segments apply retroactively once we create them.
    What’s more, we can solve the problem of a page that belongs to multiple groups.

    The pitfall is that only 3 custom filters can be applied at the same time.

  13. Rehan Asif says:

    Hi Justo,

    Advanced segments do have their pros and cons.
    Please let us know if you have written up an advanced segments approach; we would be interested in reading and trying it.

  14. Justo says:

    Hi Rehan:

    I just read the post by your colleague Feras, which explains exactly what I had in mind. He didn’t call it content grouping, but you will see that he did what we are talking about.

    http://analytics.blogspot.com/2009/02/urban-apparel-and-advanced-segments.html

  15. Rehan Asif says:

    Hi Justo,

    You are right.
    We worked on that a while back so I forgot about it. :)

    I guess the content grouping approach you take depends on your needs.
    Sometimes profiles make sense, other times advanced segments make sense.

  16. Lavan S says:

    Thanks for the great post! This should really help me digest the content on the site I am currently working on. I was wondering, does this filter method work on subdirectories in a domain? For example:

    http://www.domain.com/product_type1/

    Contains a listing of all brands that fit that product type. Once they click on a specific product then the URL is

    http://www.domain.com/product_type1/brand/product.html

    Basically I would like to be able to do the same sort of dissection but by the directories (in this case /product_type1/ or /brand/). Is that possible?

    Thanks in advance.

    Lavan

  17. Rehan Asif says:

    Hi Lavan,

    I don’t see any reason why it would not work for subdirectories.

    If you look at the Create New Filter screenshot, you can see in Field A -> Extract A that we are looking for “vans”.
    So instead of “vans”, you can set it to whatever product_type1 or brand are.
    Let’s say one of your product types is “running shoes” or the brand is “reebok”; then it should work fine.

    You just have to be sure that in your URLs you won’t get a situation where the keyword you are trying to match in Field A -> Extract A is accidentally in the product name.
    So if you have something like http://www.domain.com/running_shoes/reebok/cool-vans-style-shoe.html, you would accidentally match the Vans filter.
    If this were the case, you would have to use a more complicated regular expression to make sure you were catching the right URLs.
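One possible shape for such a stricter expression, assuming the /product_type/brand/product.html layout from the question, is to match the brand only when it is the second path segment. The pattern, the brand list, and the function name `brandFromPath` are assumptions for illustration, not a tested Google Analytics filter.

```javascript
// Match the brand only when it is the second path segment, e.g.
// /running_shoes/reebok/..., so a product name containing "vans" is ignored.
// The brand list and the function name are assumptions for this example.
const BRAND_SEGMENT = /^\/[^\/]+\/(reebok|vans)\//;

function brandFromPath(requestUri) {
  const match = BRAND_SEGMENT.exec(requestUri);
  return match ? match[1] : null;
}

const uri = "/running_shoes/reebok/cool-vans-style-shoe.html";
console.log(/vans/.test(uri));   // true: the naive keyword match hits this URL
console.log(brandFromPath(uri)); // "reebok": the segment-anchored match does not
```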

  18. Lavan S says:

    Hi Rehan,

    Thanks for the clarification :) I have set up the site in that manner and I will just wait for the results to come in. Thankfully our site doesn’t have much overlap between nomenclature so this should work to filter at a product level as well as a brand level.

    Cheers,
    Lavan

  19. Hello Rehan!
    Thanks for the idea, I was wondering about creating such a core filter or using the new advanced segment to group data.
    I’m just wondering about the use of this core filter: would it be better to output to a medium “[brand]” instead of rewriting the URI? Rewriting makes me lose all of the page-level data in the profile… Looking forward to your thoughts on the issue ;-)

  20. Rehan Asif says:

    Hello Patrice,

    Why don’t you try it out and let us know if you found it useful? :)
    It definitely sounds like it might be a good idea.

    Just make sure to create a new profile for this test so you don’t accidentally and permanently alter data in an undesirable way.

  21. Vivek Talyan says:

    I was wondering if you need the advanced filter, or can we do the same thing with the ‘Search and Replace’ filter? It is easier for me to get my head around the S&R filter than the Advanced filter.

  22. Rehan Asif says:

    Hi Vivek,

    I think the limitation (or perhaps advantage, depending on your viewpoint and task at hand) of the search and replace filter is that it will only replace the portion of the URL that is matched.

    Whereas an advanced filter will let you overwrite the entire URL.

  23. Niklas says:

    Hi! Great post! I have a problem (sort of the same?) that one of you might have a solution for.

    My site is dynamic and every page has a query string with two parameters:
    ?placeid=1111&page=1

    A subsite I would like to “content group” has random ‘placeid’ values (i.e. no real logic between the numbers) and hundreds of them, so filter restrictions make it impossible to do an include filter (with “or” between every ‘placeid’). I managed OK on one subsite, but the next subsite had too many pages: the placeids made the filter too long. Is there any way to “concatenate” two include filters, each 255 characters long?

  24. Rehan Asif says:

    Hi Niklas,

    It is not possible to chain include filters the way you want because by including some pages in one filter, you are automatically excluding everything else.
    By the time you get to the next include filter, the pages you want to include have already been excluded.

    Exclude filters are a little more straightforward and you could chain exclude filters to eliminate all the pages you don’t want included.
    However, if you have many products then your list of exclude filters will be quite long and difficult to manage.

    Could you give me an example of the kind of placeids that you are trying to catch in a filter?

  25. Niklas says:

    Thanks, Rehan!
    I could maybe cope with some sort of short form. Right now I use an include filter with this syntax: ?placeid=1111|?placeid=3223|?placeid=6575 etc.

    Maybe something like “placeid=(1111|3223|6575 etc)” would work? That would fit at least double the number of pages in one group.

    Is there a “NOT” that can be combined with exclude? That would solve it. Does anyone know?

  26. Rehan Asif says:

    Hi Niklas,

    I definitely would go with your second syntax so that you don’t waste precious space with multiple instances of placeid.
    placeid=(1111|3223|6575|etc)

    Maybe it is still too early for me today – could you try explaining what you mean by not possible to combine with exclude?
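To see how much room the shorter syntax saves, the two forms can be compared by length before pasting anything into a filter field. This is a rough sketch: the 255-character figure is the one mentioned earlier in the thread and is treated here as an assumption, and the function names and the `/list` path are made up for the example.

```javascript
// Compare the two include-filter syntaxes discussed above by pattern length.
// FILTER_FIELD_LIMIT is the 255-character figure mentioned in the thread;
// treat it as an assumption and check it against your own account.
const FILTER_FIELD_LIMIT = 255;

// One full "\?placeid=NNNN" alternative per ID (question mark escaped).
function repeatedForm(ids) {
  return ids.map((id) => `\\?placeid=${id}`).join("|");
}

// The parameter name written once, with the IDs grouped in parentheses.
function compactForm(ids) {
  return `placeid=(${ids.join("|")})`;
}

const ids = ["1111", "3223", "6575"];
for (const pattern of [repeatedForm(ids), compactForm(ids)]) {
  const fits = pattern.length <= FILTER_FIELD_LIMIT;
  console.log(pattern, pattern.length, fits ? "fits" : "too long");
}

// The compact pattern still matches only the listed IDs.
const compact = new RegExp(compactForm(ids));
console.log(compact.test("/list?placeid=3223&page=1")); // true
console.log(compact.test("/list?placeid=9999&page=1")); // false
```

Note that an unanchored group such as `placeid=(1111)` would also match `placeid=11112`; adding a delimiter such as `(&|$)` after the group avoids that at the cost of a few characters.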

  27. Lena says:

    Curious if you know of a restriction on the URI fields that prevents the usage of question marks?

    I’m having some problems getting the filter to work properly for any unique URL identifiers that include them. Is there a workaround or am I out of luck?

    Thanks!

    PS…Great article :D

  28. Rehan Asif says:

    Hi Lena,

    Question marks are special characters, so you have to escape them so that they don’t break your regex.

    You can prefix the question mark with a backslash (\).

    Let me know if that does the trick for you.
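The effect of the missing backslash is easy to reproduce in any regex engine. The sketch below uses JavaScript regular expressions, which behave the same way for this case; the `/product.asp?id=7` URL and the `escapeRegExp` helper are assumptions for the example, not part of Google Analytics.

```javascript
// "?" is a regex quantifier, so "product\.asp?id=7" means "product.as",
// an optional "p", then "id=7". Escaping it matches a literal question mark.
const unescaped = /product\.asp?id=7/;
const escaped = /product\.asp\?id=7/;
const uri = "/product.asp?id=7"; // hypothetical URL

console.log(unescaped.test(uri)); // false
console.log(escaped.test(uri));   // true

// Hypothetical helper that backslash-escapes every regex special character,
// handy when a URL fragment should be matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

console.log(escapeRegExp("?placeid=1111")); // \?placeid=1111
```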

  29. Yep, that’s certainly one way of doing it!

    My concern would be that this method would sum the unique pageviews in the database across all the pages concerned, so it’s difficult to get a visit level figure for the content group.

    One alternative, if you can afford the implementation, is to make an extra call to trackPageview on the required pages.. something like this:

    pageTracker._trackPageview('contentLevel::Vans');

    then in your main profile, exclude filter – contains “contentLevel”.

    In a duplicate profile, include only – contains “contentLevel”,
    then filter out the text “contentLevel::” using a S&R filter.

    I think that if you group in this way before it goes to the database, you’ll get unique pageviews (visits-ish) that mean something.

    Anyone have an opinion on this method, and am I correct in my assumption of how uniques are calculated before and after roll-up?

  30. Rehan Asif says:

    Hi Adam,

    Unique pageviews are considered by session.
    So for example, if I view page A ten times in one visit and five times in a second visit, that will count for two unique pageviews.

    In the case of content grouping, you are telling GA that the same page was viewed over and over (making it easy to group).
    Usually what we do is put each content group in its own profile as well, and that gives us the visit count for that content group.

    Obviously there is more than one way to do many things in GA; your method might very well work so feel free to do it our way, your way, or someone else’s way. :)
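The per-session counting described in this reply can be worked through with a few lines of code. This is a simplified model of the definition given above, not Google Analytics' actual processing; the hit format, the function name `uniquePageviews`, and the sample URLs are assumptions for the example.

```javascript
// Each hit is [sessionId, page]. A unique pageview counts a page at most
// once per session, however many times it was viewed in that session.
function uniquePageviews(hits) {
  const seen = new Set();
  const counts = new Map();
  for (const [session, page] of hits) {
    const key = `${session}|${page}`;
    if (seen.has(key)) continue; // already counted for this session
    seen.add(key);
    counts.set(page, (counts.get(page) || 0) + 1);
  }
  return counts;
}

// Page A viewed ten times in visit 1 and five times in visit 2.
const repeatHits = [
  ...Array(10).fill(["visit1", "/A"]),
  ...Array(5).fill(["visit2", "/A"]),
];
console.log(uniquePageviews(repeatHits).get("/A")); // 2

// Two different Vans pages in one visit: one unique pageview each before
// grouping, but a single unique pageview once both are rewritten.
const rawHits = [
  ["visit1", "/vans-a.html"],
  ["visit1", "/vans-b.html"],
];
const groupedHits = rawHits.map(([session]) => [session, "/vans.html"]);
console.log(uniquePageviews(groupedHits).get("/vans.html")); // 1
```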

  31. Nikki says:

    Hi,

    We are trying to do a similar thing on our website. We want to provide reports for individual departments based on the content they have created. This content may be duplicated across several sections of the website and not in specific or logical groups. What would you recommend as the best way to do this? The current solution we use lets us add a tag to each page which we can then search for to provide reports. Can we do something similar in GA?

    Thanks

  32. Rehan Asif says:

    Hi Nikki,

    Our approach is based on the URL and hopefully the URLs on your website make it possible to group in such a fashion.

    If not, there is another approach to consider.
    Google Analytics just came out with Multiple Custom Variables, which include page level tags that can be deployed.
    I think it might be similar to what you currently have deployed.
    With the page level tags, we would get a summary in GA of how many times each type of page level tag was viewed, and this should be the same as the number of times content was viewed for each department.

  33. Hello Nikki,

    Would you please provide us with some URLs of the pages that you would like to group?

    Also, what do you mean by “content may be duplicated across several sections of the website and not in specific or logical groups”? Are you saying the same page belongs to more than one department? Could you give an example?

  34. Mike says:

    Hi.

    Good article. I’m trying to implement.

    I have pages that are of the following structure: /Tire_Results_By_Size/265/70/17 and /Tire_Results_By_Size/31/10.5/15

    I want to group all /Tire_Results_By_Size/ so that I can look at Entrance Paths of all /Tire_Results_By_Size/ pages. I have a number of situations like this. This is just one example.

    I would like to combine criteria to have output following this logic:

    IF source page(s) = /Tire_Results_By_Size/265 OR /Tire_Results_By_Size/31
    THEN output page = /Tire_Results_By_Size/

    How would I use your filter to rewrite two (or more) URLs into a single, new URL?

    Mike

  35. Hi Mike,

    Follow these steps:

    - Create New Profile (ex. Name: Tire Results by Size Profile)
    - Create New Filter (Tire Results Filter)
    - Filter Type: Custom filter -> Advanced
    - Field A -> Extract A: Request URI: /Tire_Results_By_Size/
    - Output To -> Constructor: Request URI: /Tire_Results_By_Size/
    - Apply the new filter to the new profile

    Good luck,

  36. Ophir Prusak says:

    This is a great solution, though I’m thinking readers should also look into content grouping using the new custom variables in GA.

  37. Thanks Ophir,

    Yes, the new MCV is an awesome feature, but with regard to content grouping I feel it will be very hard to do any path analysis or goal tracking using MCV (to mention just a few). If you have any case studies, please share them with us.

