I keep track of the Change This website. It provides PDF summaries of 12-20 pages on new books. I like reading them - I don't think I've bought a book from it yet (the to-read pile is huge) - but it is a great source of new ideas.
I really enjoy the summary from the "Back of the Napkin" book. The author is Dan Roam and he has a blog too - the book is on Amazon. His point is that simple pictures can help tell business stories.
It's a well-known fact in business intelligence - so many bad charts are made using Excel and other tools. Geographers know this well since they utilize maps so effectively. Statisticians use lots of charts to explain their models.
However, these uses are mainly to show quantitative data. Roam dives into using pictures to tell qualitative stories - and his examples have fired up my imagination. I'll certainly try some of his ideas and I might even buy his book...
Sunday, May 18, 2008
Sunday, May 11, 2008
The Big Time
This week's Directions Magazine podcast discusses the ever popular question for geographers - when will companies completely embrace the technology? The question is very similar for statisticians too - mostly, both disciplines are kept in a niche. Statistics applies to direct marketing models, quality control, some forecasting, and perhaps some supply chain problems. Geography applies to site selection and maybe some cost forecasting.
Of course, this is a complete generalization. Many companies completely embrace these disciplines in many ways. But as the podcast notes, even Microsoft and Oracle have a hard time selling geography (and their statistical products are light too).
I have two points. First, you don't need fancy software or training to get started in either discipline. Geography, for starters, isn't that complicated. Just get some zip codes, or states, and away you go. When I worked at the Bank, I helped with the Seattle Seahawk credit card - and surprise, surprise, the cardholders tend to be centered around Seattle! Don't need a master's in geography for that one. Similarly, you can do lots of statistical analysis with Excel. Just right click on a chart, and you have a regression showing you R^2.
To break from this level of analysis, you need a company that values information. Which is inherently the problem. If executive management doesn't know the difference between a right click regression and something done properly, then how can they decide? Basic statistics are taught in MBA classes and geography isn't, but statistics is typically just for your thesis. It is rarely applied to case studies or integrated into finance, marketing, operations, etc.
Secondly, both disciplines do not speak the language of the corporation - money. Locating a store at this location, because it is analytically proper - either by visual inspection of a map or complex spatial analysis, isn't the point. The root question is - how profitable will this location be? You need to transform your geographical analysis to the Profit and Loss statement - or any other financial document.
Statistics does get closer. With direct marketing models, one can easily calculate gross profit/marketing contribution and show that the marketing campaign was profitable. But it is difficult to jump from a campaign-centric financial document to a corporate strategy financial document.
Until we can make the transition, we'll be in support roles. We help evaluate risk by providing a framework to think about it. With a statistical model, direct mail campaigns have an x% response rate. With this geographic analysis, we are confident that this proposed site will have sales of at least the chain average.
With all the pressure of quarterly numbers, executive management is watching their financial documents. It is our challenge to be there.
Sunday, May 4, 2008
Reality Mining and Good-Bye Suburbia
BusinessWeek had 2 interesting articles in last week's issue. They were even a couple of pages apart, and they complemented each other well.
The first is a really interesting new data series - mobile phone locations. As your phone drives around, it is constantly in communication with the different cell phone towers so that you are located if they need to make your phone ring.
Well, it turns out, you're carrying a large RFID tag. The cell companies have a record of who was linked to what tower when. Meaning, they know where you have been. Imagine the terabytes of data they have. Phone #123-4567 is here and here and here. For retail geographers, this could be the holy grail.
With the phone number, you know where the person lives, so you have accurate geo-demographics. Then you can track this person anywhere. You have an idea where they work, where they play, or where they hide from the world. Beyond that, you can simulate traffic patterns. This'd be huge for billboard people - X million people with this demographic profile drive by your sign.
For restaurants, it's all about traffic, and now you can put a time-stamped demographic profile, by time of day and day of week, on this specific corner. For land developers, you have a time series of traffic patterns to identify quick growing areas or areas with changing traffic patterns.
Wow.
In contrast, we have the end of suburbia. James Kunstler, an author I am not familiar with, says that with gas prices permanently rising, the suburbs, which are a waste of resources, will decline sharply. It will be too expensive to drive from home, to school, to soccer practice, etc. He states that cheap gas (and cars) makes suburbia possible.
This is true. In the good ol' days, cities had incredible density with decent public transportation (watch Roger Rabbit again). As everyone was able to get cars and fill them with cheap gas, developers created suburbs and here we are. Kunstler thinks suburbs will be hurting within 5 years.
He also says that biofuels and hybrids won't help since they won't reduce oil consumption enough to drop demand (therefore price). He thinks "anything organized on a grand scale is liable to fall into trouble - government, finance, corporate enterprise, agribusiness, schools."
Ouch.
Well, at least we can monitor the decline with cell phone tower data...
Monday, April 28, 2008
Inflation
Stephen Few is a consultant who specializes in graphic design of business intelligence. He's a practical Edward Tufte. He did a 2 day training at Coldwater Creek and I keep up with his blog. He posted an article this week from a friend of his - Jonathan G. Koomey.
The article discusses the necessity to adjust for inflation when analyzing monetary trends over time. It is somewhat obvious - but who actually does it? Economists yes - it's what they do. But do we do it for our statistical and geographic endeavours? Thanks to Alan Greenspan, and our low inflation rates, it has really been almost a moot point for the past handful of years.
I started thinking. What other inflationary type effects exist that should routinely be controlled for? If your business has changed pricing strategy, then how have you adjusted forecasted sales? Is your product mix constant? Has your percentage of sales per channel evolved?
An average order value, year over year, may have changed significantly. A probability to purchase a swim suit, when we are now focusing on dresses, has lost historical meaning. And if you are doing life time RFM - is a $50 purchase 10 years ago still $50? Do you correct that for inflation or decrease it since its NPV is less? Would you increase costs as well if you are forecasting profit?
Thus, in the RFM mix, only one component is changing and small M changes could mix things up quite a bit. It could work against recency as an inflation adjustment would increase the value of older sales - which would be the wrong thing to do. From a store sales forecasting/analysis perspective, when looking over 10 years, correcting for the changing nature of monetary values would obviously be necessary, but for year-over-year statistical models - I'd have to play first and see.
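To make the $50 question concrete, here is a minimal sketch of restating old purchases in current dollars before summing the M in RFM. The price index values are made up for illustration (they are not real CPI figures), and the function names are hypothetical.

```python
# Hypothetical price index (base year = 100). These are NOT real CPI values;
# swap in an actual published series before using this for anything.
PRICE_INDEX = {1998: 100.0, 2003: 112.0, 2008: 130.0}


def to_current_dollars(amount, purchase_year, current_year, index=PRICE_INDEX):
    """Restate a historical amount in current-year dollars."""
    return amount * index[current_year] / index[purchase_year]


def monetary_value(purchases, current_year, adjust=True):
    """Sum a customer's purchases, optionally adjusting each for inflation.

    purchases is a list of (year, amount) pairs.
    """
    total = 0.0
    for year, amount in purchases:
        if adjust:
            total += to_current_dollars(amount, year, current_year)
        else:
            total += amount
    return total


# One customer: a $50 purchase 10 years ago and a $50 purchase this year.
purchases = [(1998, 50.0), (2008, 50.0)]
nominal = monetary_value(purchases, 2008, adjust=False)
real = monetary_value(purchases, 2008, adjust=True)
```

With these made-up index values the old $50 counts as $65 today, so the adjusted M is larger than the nominal one. Whether that helps or hurts a model that also rewards recency is exactly the thing I'd want to test first.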
Sunday, April 20, 2008
Drive Time Errors
Drive time software is very popular in real estate research. Click on the map and it uses computerized roads to estimate how long it will take to drive from where you clicked. Typically, the software will estimate travel time bands, so, you'll end up with an area on the map that represents travel time of at most 10 minutes, 20 minutes...
It's interesting to note that these estimates are taken as exact. Of course, they are not. When you click on the map, you are not asked for what day, or what time of day you are estimating travel. We all know about rush hour and we all know that Saturday is different than Tuesday. In some areas, summer is different than winter.
It would be a lot to ask the drive time software to estimate this. They'd have to have a tremendous amount of data with each road segment to make it happen. The cost alone would surpass the benefit to knowing this precisely.
But one could estimate it. When making a drive time analysis, increase and decrease the estimate by 10, 20, and 30%. Have 10, 9, 8 minute drive time bands. Then conduct sensitivity analysis - what different decision(s) would you make if the travel time is really 8 or 11 minutes?
If the decision is the same, then it is probably straightforward. If the decision is different, then the right decision is sensitive to the underlying assumptions. More than likely, drive time is not the only data point that is confusing the decision, but not completely trusting a drive time estimate may help point out uncertainty.
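The sensitivity check can be sketched in a few lines. This assumes a deliberately simple, hypothetical decision rule (a neighborhood counts if its drive time is within some threshold); a real site decision would weigh far more than one number.

```python
def drive_time_scenarios(estimate_minutes, pct_changes=(0.10, 0.20, 0.30)):
    """Return the estimate scaled up and down by each percentage.

    Keys are the signed percentage change, values are minutes.
    """
    scenarios = {0.0: estimate_minutes}
    for pct in pct_changes:
        scenarios[pct] = estimate_minutes * (1 + pct)
        scenarios[-pct] = estimate_minutes * (1 - pct)
    return dict(sorted(scenarios.items()))


def decision_is_sensitive(estimate_minutes, threshold_minutes,
                          pct_changes=(0.10, 0.20, 0.30)):
    """True if the within-threshold answer flips across the scenarios.

    The threshold rule is an assumption made for this illustration.
    """
    times = drive_time_scenarios(estimate_minutes, pct_changes).values()
    answers = {minutes <= threshold_minutes for minutes in times}
    return len(answers) > 1


# A 10 minute estimate against a hypothetical 12.5 minute cutoff:
# the +30% scenario (about 13 minutes) falls outside, so the answer flips.
flips = decision_is_sensitive(10.0, 12.5)

# Against a 15 minute cutoff every scenario gives the same answer.
stable = decision_is_sensitive(10.0, 15.0)
```

If the answer flips, that is the signal to stop treating the drive time band as exact and go find out what the road really looks like at 5pm on a Tuesday.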
Sunday, April 13, 2008
So You're a Statistician...
I was out for a bike ride on Saturday with a new group of friends. I always enjoy the conversation when people learn that I am a statistician - the first look is always the I-remember-that-class-from-college look, and then comes the what-do-you-really-do look.
After explaining, they usually get excited and have some understanding of the enjoyment I get helping people make outstanding business decisions. But almost inevitably, the comment comes up about being precise. The comment may be about my balancing our checkbook to the penny (ha!), or being intolerant of mistakes at work.
I always laugh and explain that they were not paying attention in college! Statistics is not about being precise - it's the exact opposite. It's understanding that a precise estimate is not realistic, so you have to include wiggle room when guessing the future. Rather than Tiger Woods will win the Masters today, it's he'll place in the top 10.
It's interesting to see when a "statistical" view point is newsworthy - how many articles do you read that say we are, or are not, currently in a recession? Here, it is advantageous to not be precise. Politicians try not to be precise as they enjoy having wiggle room to work with.
But, it'd also be great to hear more news like a statistician too - not mission accomplished, but we're approaching the expected value of the engagement. Rather than earnings of 7 cents per share, it's we'll do well this quarter.
Actually, that'd be interesting - more wiggle room will drive the precise people I know crazy! And that's fun to watch!
Sunday, April 6, 2008
Right Question
Another sports post this week. 60 Minutes did a piece on Bill James, the Boston Red Sox data nerd. He’s credited with helping them win 2 World Series. Even if you don’t like baseball, it is an interesting data nerd piece, since Bill has developed innovative metrics to measure baseball performance. Bill says the secret is that you have to ask the right question.
As a data nerd, I appreciate his point. I have often found that the best way to answer a particular question is to answer another question. The trick is not answering either question, but tying them together to provide proper context for a decision. Every decent data nerd learns this trick, but let's consider 2 situations where I know the asked question is wrong.
First, consider direct marketing response rates. When I was in the credit card industry, I created very successful statistical response models that had over 2% in response. Today, those response rates are nearly ¼ of a percent! At 98% wrong, I was successful. Now, their model’s responses are damn near 100% wrong!
As the linked article says, these basis point response rates are still a success for the banks. And this is where the direct marketing field needs their Bill James - how can you be 100% wrong and be right?
For a while, I’ve thought that the response distribution needs academic attention – it’s nearly binomial – you either responded or you didn’t – but it also has a continuous piece since each customer is a measurable profit stream. To address this special case binomial, the industry combines logistic regression (for response) with linear regression (for the profit stream). This combination results in profitable successes - that are 98% wrong.
For site selection, success is equally complex. If a store does well, is it because of the location, the merchandise, or because of the specific store management? Is success viewed just within the one store or the network of stores in a market? ROIC may be the ultimate success measure from a real estate strategy point of view, but the numerator in the equation is driven by sales that are dependent on the humans actually running the cash register, providing customer service, and selecting the merchandise.
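Here is a rough sketch of how the two pieces get tied together once the models exist. It takes the outputs as given: a response probability (what the logistic regression would produce) and a profit for those who respond (what the linear regression would produce). All of the numbers and function names are hypothetical.

```python
def expected_profit_per_piece(p_response, profit_if_response, cost_per_piece):
    """Expected profit for one mail piece.

    p_response:         predicted probability of response (logistic model output)
    profit_if_response: predicted profit given a response (linear model output)
    cost_per_piece:     cost to print and mail one piece
    """
    return p_response * profit_if_response - cost_per_piece


def campaign_contribution(pieces_mailed, p_response, profit_if_response,
                          cost_per_piece):
    """Expected contribution of the whole campaign."""
    per_piece = expected_profit_per_piece(
        p_response, profit_if_response, cost_per_piece
    )
    return pieces_mailed * per_piece


# Made-up numbers: a quarter of a percent response, $400 of profit per
# responder, and 50 cents to mail each piece.
per_piece = expected_profit_per_piece(0.0025, 400.0, 0.50)
total = campaign_contribution(1_000_000, 0.0025, 400.0, 0.50)
```

With these invented inputs each piece is expected to return about 50 cents after mailing costs, even though 99.75% of recipients never respond. That is the whole puzzle in one line of arithmetic.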
As the economy continues to evolve, we are seeing more retailers changing their minds in their store strategies. Lots of closed stores for Talbots, Ann Taylor, Sigrid Olsen, and perhaps more are coming. How did Ann Taylor measure success? They are closing 117 out of 850 stores - that's more than 1 mistake for every 8 decisions (13.8%). That's a huge capital investment error.
What questions should we be asking? Who is our Bill James? Is it you?
Sunday, March 30, 2008
Two Centers of Attraction
As I was thinking about my very clever post from last week, I drove past a golf course. Now, it's hard to imagine the actual course, as we are still under about 2-3 feet of snow, but I instantly had visions of golf ball maps - arcs gracefully converging to the hole. My thoughts drifted to distance decay and what map style would provide an understanding of where the troubled places for this particular hole are...
But wait - a golf course is different since it actually has 2 centers of attraction - the hole and the tee. The beginning and the end have obviously high levels of activity. Granted, the tee off area is bigger, with options for moving somewhat around - and a segmented launch pad by gender, but, at its purest form - it has a single beginning and a single end.
My thoughts then drifted to another sport with dual centers of attraction - basketball with 2 hoops (and I have all 4 Final Fours in my bracket too!). Baseball has 2 - a pitcher's mound and the batter's box, but then it gets complex with the outfield.
Golf is rather analogous to retail geography. The store is the obvious center of attraction - studied as such for nearly a century. It is typically within another center of attraction - the city, which has been studied for centuries. But the other center of attraction is where the customer lives.
Blocks and block groups simplify the customer center - very similar to the golf tee off area. Direct shopping trips may be as rare as the hole-in-one, since customers may make a handful of stops - picking up friends, stopping by the post office, getting gas, picking up dry cleaning, etc. on the way to the ultimate destination. And once we consider the trip back, we leave golf's analogy and jump into basketball.
I'm not a golf fan, so for all I know, this process already has a well formulated analytical basis. After all, golf video games have to deal somewhat with this collection of vectors. But that is essentially what it is, a collection of vectors starting in one place and ending in another. To analyze this, I'd try graph theory - a visual mathematics to explain space, vectors, and movement. Graph theory can provide solutions to the traveling salesman and the postal delivery problem, which have some similarity to our shopping trip vectors.
The result would probably end up as a gravity model substitute, but the analytical focus would be the actual travel vectors rather than the ultimate destination's attractive measure (the Mi and Mj in the Wiki link). Or perhaps it would be a more accurate distance (Dij) value.
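For reference, the classic gravity model form is small enough to sketch. The constant k and the distance exponent beta below are assumptions (beta = 2 mirrors the physics analogy; in practice it gets calibrated), and trip_distance is a hypothetical helper showing how a multi-stop trip could feed a more realistic Dij.

```python
def trip_distance(legs):
    """Total distance of a multi-stop trip, given the length of each leg."""
    return sum(legs)


def gravity_interaction(mass_i, mass_j, distance_ij, k=1.0, beta=2.0):
    """Classic gravity model: k * Mi * Mj / Dij**beta.

    mass_i, mass_j: attractiveness of origin and destination (for example,
                    population and store size)
    distance_ij:    distance or travel time between them
    k, beta:        assumed constants that would normally be calibrated
    """
    if distance_ij <= 0:
        raise ValueError("distance_ij must be positive")
    return k * mass_i * mass_j / distance_ij ** beta


# Straight-line trip of 4 miles versus the same shopper making two stops
# on the way (hypothetical leg lengths that add up to 5 miles).
direct = gravity_interaction(100.0, 50.0, 4.0)
with_stops = gravity_interaction(100.0, 50.0, trip_distance([2.0, 1.5, 1.5]))
```

The only change between the two calls is the distance, which is the point: swapping the direct Dij for the distance actually traveled lowers the predicted interaction without touching Mi or Mj.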
But the results would have another critical use - you should be able to predict where the vectors pause on the way to the destination. I'm sure golf fans can tell you where the golf balls are typically going to be on a 3 shot strategy on Augusta's 16th green. So, where would you put a store - using what business plan - with a 3 vector strategy with folks going from this neighborhood/town to a mall?
I think the billboard people do something like this for placing ads on the way to and from work. I've also heard that high traffic stores - like Walgreens/CVS or fast food places - do something like this too. Although, they'd probably be better off with decent traffic counts (and I know they use that).
So, how good is your short game? In golf that is the transition from the tee off to the putting green. In real estate, could it be what high traffic companies are doing? If not, is this an opportunity?
Sunday, March 23, 2008
Political Center of Attraction
Map nerds have a love/hate affair going on with today's news. Maps of the political climate are ubiquitous - Pennsylvania is now getting attention with its big primary coming up. Thematic maps are everywhere, which map nerds must love; however, they'd also hate them since they typically feature poor cartography.
Of course, Jon Stewart has the best take on this. He does expand it though to include business intelligence charts - my favorite is the statistical lazy susan.
So let's quickly summarize the Democratic primary maps - Obama does better in urban and uppity suburbs. Hillary does better in the blue collar type neighborhoods. I may not have the generalities entirely accurate, but let's just assume that they are.
Sometimes, these maps look like trade area maps because they tend to have natural groups - spots where the colors tend to group together. This is Obama's section; this is Hillary's. Similar to: this is Store A's trade area; this is Store B's.
Trade areas have a center of attraction - a store. The further you get from the store, the weaker the attraction is, meaning that customers are more likely to shop at their nearest store. The trade area boundary is broadly defined as the point at which the attraction of one store becomes greater than that of another.
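A rough Python sketch of that boundary idea follows. It assumes attraction is store size divided by distance raised to a decay exponent, which is one common simplification and not the only functional form. The store names, sizes, coordinates, and the exponent are all invented for illustration.

```python
import math


def attraction(size, distance, beta=2.0):
    """Attraction decays with distance; beta is an assumed decay exponent."""
    return size / max(distance, 1e-9) ** beta


def assign_trade_area(customer_xy, stores, beta=2.0):
    """Return the name of the store with the strongest pull on this customer."""
    return max(
        stores,
        key=lambda name: attraction(
            stores[name]["size"], math.dist(customer_xy, stores[name]["xy"]), beta
        ),
    )


# Two hypothetical stores of equal size, 10 units apart.
stores = {
    "A": {"xy": (0.0, 0.0), "size": 1.0},
    "B": {"xy": (10.0, 0.0), "size": 1.0},
}

print(assign_trade_area((2.0, 0.0), stores))  # nearer to A
print(assign_trade_area((8.0, 0.0), stores))  # nearer to B

# Make B four times larger and its trade area reaches past the midpoint.
stores["B"]["size"] = 4.0
print(assign_trade_area((4.0, 0.0), stores))
```

With equal sizes this reduces to "nearest store wins". Unequal sizes push the boundary toward the smaller store, which is the same question the post asks about a stronger or weaker political center.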
So what is a political center of attraction? Is it a strong politician like a mayor, Congressman, or state representative? Could it be a super delegate? Do distance decay curves or the gravity model help predict voter behavior or trends in political contributions?
Of course, stores rarely stand by themselves, so you have a mall or a downtown center too. What is the political equivalent of that? It'd have to be some sort of neighborhood arrangement - which hints that the center of attraction would be a good precinct captain. Heck, this'd be easy if it was Chicago in the 1920s - it'd be Al Capone or one of his friendly Aldermen!
I'm mostly curious to see how well this applies. If political geography does parallel trade area theory, then you'd have something other than just demographics, prior voting behavior, and contribution data to predict votes. I've never seen the nuts and bolts of how professional political geographers do it, but it'd certainly be interesting to learn more.
Sunday, March 16, 2008
Large, Large Segments
I read in DM News on Friday about a presentation at the New England Mail Association given by Peter Grebus, who is in charge of Williams Sonoma's Customer Information Management group. I found it interesting since it's not very often that we can read some strategic details about the mix between direct marketing statistical models and retail sales.
The complete text is here and, of course, he talks about the economy, but here's what interested me:
Already, the company has been mailing deeper into its file with smaller versions of the catalog in geographic regions such as Texas, where it thinks a significant portion of recipients are driven to retail stores via a catalog. The company “has seen great success” with this strategy, Grebus said.
I've met the Williams Sonoma statisticians and they are top rate - it is interesting that their model is either being beaten or overruled by something as simple as a state segmentation.
Usually, statistical models can beat the pants off segmentation. If you aren't familiar with it, segmentation is the direct mail strategy of breaking customers into groups (segments) and making mailing decisions based on that. For example, Peter says that Williams Sonoma has 200 million customers - of course it doesn't make sense to mail everyone every catalog.
Assume the circulation for the next catalog is 2 million - who do you mail? Traditionally, direct marketers use RFM - recency, frequency, and monetary value - to make segments. An RFM segment may be customers who have purchased at least 3 times, made their last purchase 14 months ago, and have spent at least $500.
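As a minimal sketch, that example segment can be written as a simple filter in Python. The `Customer` fields and the sample records are hypothetical, and the recency rule is read here as "last purchase within the past 14 months", which is an assumption about how the cutoff would be applied.

```python
from dataclasses import dataclass


@dataclass
class Customer:
    customer_id: str
    months_since_last_purchase: int  # recency
    purchase_count: int              # frequency
    total_spend: float               # monetary value


def in_rfm_segment(c, max_recency_months=14, min_purchases=3, min_spend=500.0):
    """True when the customer meets all three RFM thresholds (assumed cutoffs)."""
    return (
        c.months_since_last_purchase <= max_recency_months
        and c.purchase_count >= min_purchases
        and c.total_spend >= min_spend
    )


# Made-up customer file.
customers = [
    Customer("c1", 2, 5, 820.0),    # recent, frequent, high spend
    Customer("c2", 20, 6, 1500.0),  # good history, but lapsed
    Customer("c3", 6, 1, 900.0),    # one big purchase only
    Customer("c4", 14, 3, 500.0),   # sits exactly on every threshold
]

segment = [c.customer_id for c in customers if in_rfm_segment(c)]
print(segment)
```

Every customer in the segment gets the same mailing decision, which is why the number of segments grows so quickly once you start slicing each of the three dimensions more finely.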
RFM works great. It helps answer who your best customers are, but it gets complex very fast. Would you rather have a customer who spends $2000 once every 2 years or a customer who spends $50 every month? This complexity is where a statistical model can really help.
Statistically, every customer is its own segment and when you have the process fine tuned, you are able to rank every customer from good to bad. And when you need to mail 2 million people, you query the top 2 million - done.
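The "query the top 2 million" step can be sketched in a few lines of Python. The scores below are invented stand-ins for whatever a fitted model would predict per customer, and `select_mailing_list` is a hypothetical helper, not anyone's production process.

```python
import heapq


def select_mailing_list(scores, circulation):
    """Return the customer ids with the highest model scores, best first."""
    top = heapq.nlargest(circulation, scores.items(), key=lambda item: item[1])
    return [customer_id for customer_id, _ in top]


# Hypothetical model scores, e.g. predicted response or expected spend.
scores = {"a": 0.90, "b": 0.10, "c": 0.50, "d": 0.70, "e": 0.30}

# A circulation of 2 stands in for the 2 million in the example.
print(select_mailing_list(scores, circulation=2))
```

In a database this is just an ORDER BY on the score with a row limit. The point is that the cutoff moves with the circulation, with no need to redefine any segments.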
When I first finished grad school, I was convinced that geographers were needed to help businesses discover and understand spatial relationships. One day, while working at the bank, I was talking with an MBA who managed the Seattle Seahawk credit card - and I suddenly understood that everyone is a geographer since some things are so basic... Of course the Seattle area has more Seahawk credit card holders than any place else - duh!
So, it is shocking to me that the Williams Sonoma direct mailing models are being beaten by a state segmentation. Any segment of 20-30 million is too large against individual customer data - RFM will be able to wiggle within Texans to show that Texas customer A is better than Texas customer B since A purchased $1000 more than B in the past year.
So, what's up? Perhaps Peter isn't using a model. Perhaps the retail and direct folks aren't talking to each other. I kinda doubt these - Peter mentions that they are mailing smaller catalogs to drive retail sales which means that this is a well coordinated strategy - not only are store operations involved, but the catalog folks are deciding who gets the big or the small catalog - and certainly, a model of some sort is helping them.
Assuming that Peter is using states to beat the model and nothing fishy is going on, then the challenge is likely in the data itself. RFM is probably just at the direct level and perhaps at the corporate level - thus, the model cannot figure out the differences between channels. Meaning the model can't differentiate between customers who only shop at retail, or only shop direct, or shop both.
Or confusion may exist in the definition of success. For all the bragging about statistical models, they are only as clever as they are told to be. If the model is directed to only focus on direct success, then the model will have trouble with retail sales. This could make sense - the catalog folks are trying to optimize their activities, which are very measurable (each catalog has a code to track sales from) and retail sales tend to be anonymous.
Of course, it is hard to say, but it is fun to ponder...
Sunday, March 9, 2008
Retailer Globe
Directions Magazine had a podcast a couple of weeks ago in which Joe Francica discussed his vision of how a retailer could use Google Maps (or some such web map engine) to organize and display business intelligence. I left a comment, but I'm still thinking about it - as I've been busy designing portals for years.
I've been involved with business intelligence portals for a long time. My first portal was a CD that I made in 1998 when working at First USA. I was working with a major retailer on a test: identifying customers who shopped at the retailer (from their credit card purchases), and then sending targeted offers to their customers and those who shopped the competition. This retailer was high maintenance and wanted many details. After the test was complete, I could just hear the phone ringing with various questions. So, I answered them all.
Literally, I came up with every test combination possible and made a web page for it using SAS. I saved the HTML files and built a javascript front end to direct users to files saved on the CD. It was a portable portal - no need to connect to the Internet or worry about security. If you had the disk, you were good to go.
I used this same strategy in 2000 at LifeMinders.com. We were a big permission email company - it seems like a commodity now, but it was in the dot com boom and even the back of my head was on CNBC (we ran a Super Bowl ad too!). Millions of emails were sent daily and I tracked clicks to set content strategy, bill advertisers, etc. Again, I used SAS to make every bloody report possible - it was quite successful; the reports took 2-3 hours to run, but once available, they were blazing fast for the users.
I started learning how to make a true interactive portal next - designing web pages using ASP and talking to databases. Reports started to slow down a bit, but it scaled much better. My third portal used this approach which tracked direct marketing strategy for millions of mortgage direct mail pieces. Eventually, I started experimenting with mixing operational reporting with profit and ROI metrics.
At this point, I had interviewed with a huge retailer to manage their site location department. The goal was to build a globe portal - very similar to what Joe was talking about in his podcast - in which a deal maker could click on a global location on a map and get an instant 3 year P&L projection to provide an initial read of a potential site. If favorable, additional resources would properly investigate it - with the goal to eliminate wasteful investigations.
This still sticks out to me as the ultimate in spatial analysis - clicking anywhere in the globe and creating the appropriate trade area estimate - then mixing the right amount of culture and demographics (probably using segmentation) to select (or develop on the fly) the right functional form to project revenue and project costs - for 3 years. Wow.
I didn't get the job - which is great since it opened the doors to be here at Coldwater Creek. I've played since 2002 with various portal designs to direct real estate research and I've yet to find the right mix - the biggest challenge is justifying the resources for a small audience. Which makes me wonder about the ultimate challenge. The more I understand about real estate P&L characteristics, the more I appreciate that it is less dependent on geography and more about the deal terms.
Over the years, the portal business has gone big time. Books and consultants galore exist to tell you how to do it. In 2003, I did develop the first operational portal here and it's done well over the past 5 years. Today, my peers are improving it utilizing Reporting Services and using Stephen Few's design thoughts - it's so full of features, interactivity, and rich content. My early portals were functional and highly targeted - today's are complex and built with an army. But this is the first part of Joe's portal - rich, operational metrics wrapped in a well designed system.
The second part - real estate strategy - is what I have struggled with - Blockbuster appears to have been successful - they won a MapInfo Meridian award for it. I would love to see this in action and better understand who the users are and what questions it is successfully answering.
The final part is linking the first two - which will be the toughest. After someone has clicked on the globe, how do you direct them to operational metrics, HR listings, and how this retail location is meeting their goals? And once in these details, how do you back out to understand how this store operates in the grand scheme of things? And how is the company doing overall? And vice versa?
Tough questions - especially since most users are interested not in the physical geography, but in the corporate geography - what operational zone, region, or district is the store in? Or, most importantly, what GL code? Which is one of the larger challenges for Joe's globe - getting the operational and physical geographies in sync for a highly functional portal.
See, I told you I've been thinking about it for a long time!