Tuesday, December 16, 2008

Distracted

Today was "getting poor" day, and after that I worked from home (so I wouldn't have to take a bus downtown and waste an hour commuting, since I didn't have any meetings scheduled today). But it wasn't a very productive day. The whole day I was thinking about what happened and about the plans for the next few days (my father is arriving tomorrow evening and staying until Saturday morning) and the next few months in the new house. So... no in-depth shoe size analysis for me. Just a lot of random thoughts and observations about the most important concept in all statistical analyses: if you don't have a model, you can't analyze anything meaningful. I can look at things and calculate averages, do general clustering and find patterns, but without a model I can't tell whether what I'm looking at is meaningful.

It all goes back to my inability to assert things to people in a way that they will accept and act on. I trust people too much and I don't trust myself enough. So when multiple people ask for something, even when I know it shouldn't work, I go ahead with it and reach the same conclusion the practical way.

So, that's where I am. The rest of the week is going to be mostly dead, between my father's visit and some meetings. Wednesday and Thursday are my days full of meetings (it's great that they are concentrated this way, unless I hope to get anything accomplished on those two days in particular), and Friday is my closing date and key collection. But we keep moving forward. I have started working on a model (which is currently on a sheet of paper that was invaded by information about the contractor who is fixing the roof at the new house), and that's what I'll have to settle on before I move back to analysis. Then I'll do the thing that I came to Amazon to do: analyze the Amazon catalog for patterns. In this case, that means figuring out how to train my model and validate that it correctly represents shoe sizing.

I can't quite explain what my thoughts are, but I'll say that it's a very interesting problem. What makes it so complex is that there are multiple ways of explaining the size of a shoe to a user. There are multiple size standards (including some that are very similar, like US Men's and US Women's, which are off by 1.5 or 2, depending on who you ask), and the translations between them are sometimes not fully deterministic. There is also shoe width information, again with multiple standards. And these "standards" are not exact. Some brands or product lines run smaller or larger than the number on the label suggests, which is technically another size feature.
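To make those features a bit more concrete, here is a rough sketch in Python of how they could be combined into one comparable number. This is not the model I'm working on, just an illustration under loud assumptions: the names `ShoeSize`, `to_us_mens` and `brand_bias` are made up for this example, the 1.5 offset is only one of the two values mentioned above, and width is carried along as an opaque code rather than actually modeled.

```python
from dataclasses import dataclass

# Assumed offset: women's label = men's label + 1.5.
# Some sources say 2, so it is a parameter, not a fact.
WOMENS_MINUS_MENS = 1.5


@dataclass(frozen=True)
class ShoeSize:
    standard: str             # "US_M" or "US_W" (only these two are handled)
    label: float              # the number printed on the box
    width: str = "M"          # width code, kept as an uninterpreted feature
    brand_bias: float = 0.0   # hypothetical: +0.5 means "runs half a size large"


def to_us_mens(size: ShoeSize, womens_offset: float = WOMENS_MINUS_MENS) -> float:
    """Map a labeled size to an approximate US men's equivalent."""
    if size.standard == "US_M":
        base = size.label
    elif size.standard == "US_W":
        base = size.label - womens_offset
    else:
        raise ValueError(f"unknown standard: {size.standard}")
    # A brand that runs large fits like a bigger size than its label says.
    return base + size.brand_bias
```

Even this toy version shows where the trouble is: the same women's 9 comes out as a men's 7.5 or a men's 7 depending on which offset you believe, and a per-brand bias has to be learned from data rather than read off a chart.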

So, with all those features, how can you make sense of the size? That's the question that I'm trying to answer in the next couple of weeks.
