Moving downstream

Thursday, September 27, 2007

Music

So the Amazon MP3 download store launched... So I had to go and check it out, of course! What happens when suddenly I am given a chance of quick gratification? Well, I take it! So I've been rediscovering music. Going through albums or music that I always wanted but just never had the "courage" to buy and just downloading it. It's not that expensive (I've paid so far an average of US$8 per album) and it's so easy...

Oh, don't think that I've spent a fortune. It was only about 6 albums so far. And I'm pretty happy about it. So excited that I ended up staying up until almost 2 am yesterday listening to three different versions of "...which was the son of..." by Arvo Pärt (one of which I was singing).

Anyway, music is fun. I just wish I had more time to listen to it somewhere other than on the bus (way too noisy to enjoy) or at work (which I tried yesterday, and it just ended up giving me a headache instead of helping me concentrate on what I wanted to do).

Sunday, September 16, 2007

New month, new year, just thoughts

So it's 5768... In these terrible days between Rosh Hashana and Yom Kippur, we are supposed to sit and think about all the terrible things that we have done in the last year and ask for forgiveness from the people that we may have hurt. I can make abstract claims that this blog is one of the things that have suffered this year, but it's not why I'm posting today. I'm posting today because in the middle of all these terrible and depressing thoughts, when you get your mind into thinking mode, it brings also some "positive" ideas.

What makes this even more interesting and cyclic is that these ideas end up, for me, being even more torturing than the thoughts of the past. It's worse to think about the options you will have in the future, and that you probably won't accomplish any of them, than to think about the past.

Out of these ideas that have been torturing me, some I can't really write about here. Most have to do with product reviews. I feel like most review websites out there could really benefit from a more scientific approach. Not only this, I also think that any company that is trying to make a successful product would benefit from a third-party company testing its product against competitors using scientific evaluations and then allowing time for the original company to analyze and criticize the methods employed. This can only happen if:

1) You have all the methods explicitly explained (not only ratings like: cleaning abilities - Great; smell - Good)
2) You have a relationship with the manufacturer and allow for unedited reply to analysis
3) You present all this to customers and allow the customers to suggest tests to be done (I care about MP3 players that can fall from my pocket and still work)
4) Allow also consumers to explain what their most important features in an item are to: (a) generate a single-dimensional list of products they should look into; (b) provide feedback for manufacturers on what customers are looking for so that they spend their R&D on things that matter.
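Point 4(a) can be sketched in a few lines of Python. This is only a minimal illustration under assumptions of mine: the feature names, the 0-10 scores, and the `rank_products` helper are all hypothetical, and a weighted average is just one simple way to collapse several test results into a single list.

```python
from typing import Dict, List, Tuple


def rank_products(
    scores: Dict[str, Dict[str, float]],
    weights: Dict[str, float],
) -> List[Tuple[str, float]]:
    """Collapse per-feature test scores into one ranked list using a consumer's weights."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one feature needs a positive weight")
    ranked = []
    for product, features in scores.items():
        # Weighted average; a feature that was never tested counts as 0.
        value = sum(w * features.get(f, 0.0) for f, w in weights.items()) / total
        ranked.append((product, value))
    return sorted(ranked, key=lambda item: item[1], reverse=True)


# Hypothetical test results on a 0-10 scale (not real products or measurements).
scores = {
    "Player A": {"drop_survival": 9.0, "sound": 6.0, "battery": 7.0},
    "Player B": {"drop_survival": 4.0, "sound": 9.0, "battery": 8.0},
}

# Two hypothetical consumers who care about different things.
clumsy_listener = {"drop_survival": 3.0, "sound": 1.0, "battery": 1.0}
audiophile = {"drop_survival": 0.5, "sound": 3.0, "battery": 1.0}

print(rank_products(scores, clumsy_listener))
print(rank_products(scores, audiophile))
```

The same test data gives a different ordering for each consumer, and the collected weights are exactly the feedback that point 4(b) would pass on to manufacturers.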

If you create this cycle, you might have high-quality, authoritative item reviews that will provide consumers with the power to select the best product that matches their needs and manufacturers to modify the products to better satisfy customers.

There are examples out there of specialized sites that try to approach things like that, but I have yet to see (it doesn't mean that it doesn't exist, just that I don't know of it right now) a system that includes all these feedback approaches.

The rest that I'm interested in is how to represent all this data in order to display it on some website and learn from it what makes a good product. Also, there will always be some subjective component to all reviews. How this can be split out of the "scientific" part is tricky. Any good statistician knows that all statistics are biased towards what you want to prove or disprove. The important part is to make it as explicit as possible.

Oh, well, anyway, I'm still thinking about things. I wanted to send a lot of emails yesterday and today to friends, but didn't get to send any. I was just reading things, doing some random coding here and there, I've watched "The Corporation" - great movie, by the way, I've also played some Wii and DS (so odd), and, shockingly, I've even done some art. Nothing exciting, I'll have to admit.

Ok, time to go to bed.

Thursday, August 30, 2007

Silly things that keep me up

Sometimes there are some silly things out there that keep me up until late at night. Last night this was my reason:

http://www4.wiwiss.fu-berlin.de/factbook/snorql/

Extremely addictive! The data is a little problematic in some classic areas like units of measure (they are implied by the relationship type, but not explicit anywhere) and uneven structuring (it contains things like "factbook:publicdebt '50% of GDP'", or sometimes comma-separated values to deal with lists of things, instead of having multiple relationships).

But you can query for things like: give me the countries that have a life expectancy of at least 65 years and do not produce any oil:

SELECT ?name WHERE {
?x a factbook:Country .
?x factbook:lifeexpectancyatbirth_totalpopulation ?life .
?x factbook:countryname_conventionalshortform ?name .
?x factbook:oil_production ?oil .
FILTER ( ?oil = 0 && ?life >= 65 )
}

(I first tried asking for the neighboring countries of these countries, but got an exception about a missing table... odd)
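The uneven values mentioned above (a percentage buried in a string, or a list packed into one comma-separated literal) are the kind of thing you would have to clean up on the client side before comparing anything. A minimal Python sketch, assuming the values really look like the '50% of GDP' example; the helper names and the sample list are made up for illustration:

```python
import re
from typing import List, Optional


def parse_percent(value: str) -> Optional[float]:
    """Pull the leading number out of a string such as '50% of GDP'."""
    match = re.match(r"\s*(-?\d+(?:\.\d+)?)\s*%", value)
    return float(match.group(1)) if match else None


def split_list(value: str) -> List[str]:
    """Turn a comma-separated literal into one item per entry."""
    return [item.strip() for item in value.split(",") if item.strip()]


print(parse_percent("50% of GDP"))
print(split_list("coal, copper , gold"))  # hypothetical literal, not taken from the dataset
```

Note that this only recovers the number; the unit ("of GDP") is still implied rather than explicit, which is the original complaint.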

And here is where this link came from: Open Data: Information wants to be linked

What do you need to be intelligent?

There was an interesting post at Cognitive Daily today (well, technically yesterday):

Does an artificial intelligence require a body?

Certainly the authors and the readers courageous enough to leave a comment don't quite agree with this assertion (well, at least most of them). I don't agree myself, but with a small exception: I believe that the only way you can have something to be intelligent is if it can interact with the environment. In other words, you can't make something that is intelligent by just making it consume monstrous amounts of data (e.g., Google's amazing search index will never be intelligent). Intelligence comes from the feedback to the data itself, to its ability to organize and predict data based on organization. You can't correctly predict if you don't have some sort of control over the things you are predicting.

Well, at least that's what I believe. I'm certainly not as "educated" in the subject as the people that read the Cognitive Daily blog, but, like most software developers, I've tried my "luck" in AI-like things and you learn...

Tuesday, August 21, 2007

25 years of the CD

It's strange how silly events like this make people start thinking about the current state of music in the world. It feels like everywhere I look there is an article or a discussion about how crappy the music we are listening to is right now. It goes from the dynamic range (IEEE Spectrum has an interesting article with examples about it) to the sound quality itself - imagine that some songs you buy online are compressed to ridiculous rates like 128 kbps. And people are so used to bad quality that they can't even tell any more.
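To put a number on how much is thrown away at 128 kbps, here is the arithmetic as a small Python sketch. The CD figures are the standard ones for audio CDs (44.1 kHz, 16 bits per sample, 2 channels); the function names are just for illustration.

```python
CD_SAMPLE_RATE_HZ = 44_100
CD_BITS_PER_SAMPLE = 16
CD_CHANNELS = 2


def cd_bitrate_kbps() -> float:
    """Raw bitrate of uncompressed CD audio, in kilobits per second."""
    return CD_SAMPLE_RATE_HZ * CD_BITS_PER_SAMPLE * CD_CHANNELS / 1000


def compression_ratio(target_kbps: float) -> float:
    """How many times smaller a compressed stream is than raw CD audio."""
    return cd_bitrate_kbps() / target_kbps


print(cd_bitrate_kbps())                 # 1411.2
print(round(compression_ratio(128), 1))  # 11.0
```

So a 128 kbps file keeps roughly one bit out of every eleven that were on the CD.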

I can say that I've been feeling part of this issue. For some time I tried to listen to music on the bus on the way to work and back, and it's painful. Anything with any real dynamic range forces you to keep changing volume all the time or else you are either destroying your ears on the fortissimos, or not listening to anything on the mezzo-pianos (yes, buses are quite noisy). Right now I don't listen to anything besides the news on my iPod. And for that it's pretty useful.

There I went adding a little bit more literature to the whole "the CD has destroyed music" subject. Digital compression has destroyed music, actually. And we are still going this route, with a lot of ground to cover until we find ourselves either lost or at a precipice.

Saturday, August 18, 2007

Doing this and that

So here I am again to write something, so that people don't think I completely abandoned the idea of blogging. I haven't really, but I'm never sure what to write. Lately life has been a little unfocused, to say the least. And when you don't focus much on one thing you end up not really having much to say.

What have I been reading? Well, I finally finished reading Steven Erikson's Midnight Tides. Good book, but quite long. Not a very easy read, mostly because there are a lot of characters spread around the world and with somewhat similar names. There is a section on the characters, but it doesn't really help much (it contains useless info like:

Arahathan, a mage

)

I've also finished reviewing yet another version of the major paper of my Ph.D. research and now it's back to the reviewers for another round. Everything just takes so much time, and with my terrible lack of free time, it makes cycles even longer. Writing papers takes a lot of effort - a lot of continuous time invested in it. Every interruption is very expensive. And I've received a confirmation that another paper that I've co-written with a lab-mate in Oklahoma is getting published sometime in October, I think, in JASIST. It's exciting.

In the middle of all this excitement, I have stopped pretty much all my ontology projects. I was building three large-ish ones at home, but they have stalled and now it's hard to get back to them. They are a great exercise for anybody trying to do any modeling of anything. It's hard. You just have to learn that you will be wrong and you will be changing your mind many, many times until you are finished. So be ready to almost fully rewrite everything you do every couple of weeks for some time. Especially when you are still learning how to use Protégé and for some reason it decides to mangle all your work.

Mastie is doing alright. She seems to have odd cycles: some days she is very active and hungry. Some other days she is very scared and doesn't eat very much. But, well, I can't say I see her many times anyway. But now it's the weekend, so I have a little bit greater chance.

I guess that's all for now. I'm finally getting a little tired after my dinner that ended with a nice shot of espresso (with some star anise ground with the coffee beans).

Wednesday, August 01, 2007

A new female in my life

(sorry, Amy asked for this title)

Anyway, it is true. Last weekend, after a very long deliberation period, I've decided to get a pet. My restrictions on which pet to get were:

- Doesn't need attention all the time
- Doesn't need to be fed every day
- Doesn't stink
- Should provide some interaction
- It's not too hard to take care of

So, after a lot of walking around and looking at pet stores, the final choice was a Uromastyx maliensis, Mastie:

She has been great so far (or so I've been told - since the weekend I haven't actually seen her, as I leave for work before she wakes up and get back home after she has gone back to sleep). We'll see how it goes.

Saturday, July 21, 2007

On dining experience

I had to post this:

Dining in the Dark

It's an interesting article about a restaurant in L.A. that serves food in complete darkness. Apparently it's the latest thing in Europe, but just now arriving in the US. It must be really odd!

iPhone and the web

So now that the iPhone is out, I'm preparing to start seeing pages that were built for the iPhone, with a fancy logo in the bottom or on the side saying "iPhone optimized". Actually, Apple already has instructions for it: Optimizing Web Applications and Content for iPhone. It has some interesting limitations like:

Mouse-over events
Hover styles
Tool tips
Flash

Flash is what really caught my attention, actually. Adobe is working really hard to push their Flex framework. Can this be an important challenge for people? Not that it can kill Adobe's framework, unless somebody comes up with a new framework that works on the iPhone. That could make things interesting.

Anyway, just musing about technology, while I need to get back to working on a paper that has been on my table to work on for over a month now. Time to get to it.

Sunday, July 15, 2007

Strange curve fitting

This is a cool article on Cosmic Variance:

The Best Curve-Fitting Ever

It shows that interpretation is everything when looking at real data. And you have to add to it that when you are reading a news article you never know when you are looking at real data or not.

Monday, July 02, 2007

Second Life and the future of virtual worlds

Ok, I promise it's the last post of the evening. This is a report about Mitch Kapor's talk about the future of Second Life and how it's wonderful:

Second Life chairman's stump speech takes us down the rabbit hole

It is quite interesting. I did go around Second Life once or twice. With this very limited experience, I can't say I've enjoyed it too much. But from what I've been hearing about it, it seems intriguing to say the least. We'll see where it takes us, I guess. Mitch is a crazy visionary with more money than he can probably spend sensibly. But he should keep his ideas coming, as one day we might get something useful out of it! :-)

Very strange article - the power of pheromones

This is a very strange article from nature.com:

Powerful urine is mind-altering

It talks about how strong pheromone smell can cause neurons to grow. It's just strange...

A month of half-finished posts

I guess I gave up on trying to write long and interesting posts. I have tried at least a half dozen of them, from search engines (reaction to this sort-of interesting article on ComputerWorld), to contextual ontologies, to photography (I've been trying Aperture and Lightroom - I do prefer Aperture simply because I'm much more interested in photo organization than adjustments), to responsibility. But I never finished any of them, so I decided to just write about what is going on with my life lately.

I've been busy - busy and planning on being busier in the next few months. I have something like 5 trips planned from now to early October. The most interesting of them is the one at the end of September: Yellowstone. I always wanted to go there, and now I'll finally make it there (I hope - air and hotels are already booked). It's exciting.

Work has been a little on the chaotic side. Maybe because I'm getting a little tired, maybe it's just because there are launch dates around and lots of projects coming down the pipeline. Many exciting things, many scary open problems. Coming from a research background I'm always attracted to the open problems - attracted and full of ideas about what to do and how to do things. But I know deep inside that it's all going to take a lot of time, a lot of energy, and maybe nothing will really come out of it. So I get psychologically scared away (yes, I didn't even count the number of projects I've started on my computer at home and didn't get much anywhere).

It's a little sad sometimes not to be able to follow through with an idea. You start, get excited about it, spend time to set up the environment to work on it, and then that's as far as you get. In the end what I see is that I isolate myself from things, and don't get anything accomplished to compensate for this isolation.

Oh, well, that's how it works. Now it's time for me to concentrate on something else and, maybe, just maybe, reply to some emails that have been sitting in my inbox for over a month now...

Wednesday, June 13, 2007

Silly or what? Ebay vs. Google

Isn't this just silly:

EBay pulls ads from Google's U.S. ad network

I don't know too much of the context here, but just by looking at this article it seems like something is wrong about the age of the people that decide things at those companies...

Sunday, June 03, 2007

Sun, barbecue and flash face

In the end the weather forecast for yesterday was so static that I considered it not very useful, and quite time-consuming, to enter it every day. The day was quite sunny with a high of about 80˚F... Perfect barbecue weather. There were about 30-34 people here and quite a lot of food. The only thing that didn't quite work so well is that I was a little too quick on starting the charcoal grill and it ended up being a little too cold (the vegetarians/koshers were not very well fed from the grill).

Anyway, after a long night of sleep (I don't even know how many hours I slept last night... Something like 9, I think), I went around today looking for something interesting and found something that was quite intriguing to me:

Ultimate Flash Face

It's a "simple" (as a concept, quite complex implementation) flash website that allows you to create a face. So I've spent about 30 minutes trying to generate faces for people I knew from memory and... I wasn't able to! Probably it comes with the Y chromosome or something like that. Quite disappointing...

What is in for me today? Well, I have another party later in the day and some work to do. I also will try to go for a bike ride to try out my new bike, but I'm not sure this will actually happen. I don't know if I'll have time for it.

Sunday, May 27, 2007

I knew I was going to miss a day...

Well, at least I'll make it less than two days. Here is the current forecast for next Saturday's weather:

Weather.com: Partly cloudy, high 82°F, low 54°F, chance of precipitation 10%

AccuWeather.com: Partly sunny, High: 74°F Low: 54°F

WeatherBug.com (only partial): Mostly sunny, high 75°F

University of Washington: Not really sure where their forecast comes from, and also I'm not sure how to interpret it, so I'll paste it in and then think about the interpretation some other time: SATURDAY...MOSTLY SUNNY. HIGHS IN THE MID 70S TO LOWER 80S. && TEMPERATURE / PRECIPITATION PUYALLUP 60 43 70 / 50 40 10 TACOMA 60 42 69 / 50 40 10 SEATTLE 59 48 66 / 50 40 10 BREMERTON 59 42 68 / 50 40 10 EDMONDS 58 47 66 / 50 50 10 EVERETT 57 47 65 / 50 40 10 $$

KiroTV.com (also partial): mostly sunny. Highs in the mid 70s to lower 80s.

KOMO TV (also partial): mostly sunny. High 77°F

I guess that's it... Now writing this takes a lot of extra time, but I'll keep trying. One difference now is that there is no rain forecast on any of them. There was no rain forecast for today but I woke up and it's raining...

Anyway, yesterday was a busy day. I even got a phone call from a friend that I haven't talked to in a LONG time. But I missed the call! I'll try to call him later today, and will write about it some other time (some odd coincidences that I've decided to discuss when I know more details).

Friday, May 25, 2007

Continuing the weather countdown

Not many changes today (as of 9:20 PM PST):

Weather.com: Mostly Cloudy, max 70°F, min 51°F, chance of precipitation 10%

AccuWeather.com: Cooler with rain, High: 62°F, Low: 47°F

Countdown for the barbecue

So on next Saturday Amy and I are hosting a barbecue here at home. Barbecues are fun, but they tend to be very susceptible to weather variations. So, just because I'm a scientist, I decided to start a countdown with the weather forecast from multiple sources and see how they vary as we get closer to the date.

Today, May 25, 12:40 AM:

Weather.com: Mostly cloudy, low 55˚F, high 72˚F, precipitation chance 10%

Accuweather.com: Rain, low 49˚F, high 64˚F

Each source is a little different in the way they provide weather. Some only have a 7-day forecast, so I'll try to keep adding them as they enter the range. I'll just hope that weather.com is more correct than accuweather.
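The comparison itself is simple enough to script. A minimal Python sketch, assuming all you track is the forecast high from each source; the `spread` helper is my own name for it, and the numbers are the highs recorded in the countdown entries on this blog:

```python
from typing import Dict


def spread(highs: Dict[str, float]) -> float:
    """Difference between the most and the least optimistic forecast high."""
    return max(highs.values()) - min(highs.values())


# Forecast highs (°F) for the barbecue Saturday, as recorded in the countdown entries.
countdown = {
    "May 25, 12:40 AM": {"Weather.com": 72, "AccuWeather.com": 64},
    "May 25, 9:20 PM": {"Weather.com": 70, "AccuWeather.com": 62},
    "May 27": {"Weather.com": 82, "AccuWeather.com": 74, "WeatherBug.com": 75, "KOMO TV": 77},
}

for when, highs in countdown.items():
    print(f"{when}: spread {spread(highs)}°F across {len(highs)} sources")
```

With these entries the sources disagree by 8°F every time, even as the individual numbers move around.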

Wednesday, May 23, 2007

Brainstorming

Lately one of my favorite things to do at night is to just pick up a subject and brainstorm about it, writing down whatever I feel like is relevant. It's quite interesting, because after I finish the activity I read back what I've written and enjoy how naive and contradictory all my ideas are.

Let me give you a hypothetical example. Let's say that today's subject is knowledge acquisition. So I start by writing:

- Knowledge is defined by the relationship between elements
- There are no elements, just the relationships
- The absolute is defined by a relative sense to what is culturally or personally defined as the absolute point

And there it goes... It's a fusion of not very actionable pieces of ideas. Not very exciting then, but it continues, and gets worse:

- Branch traversal is interrupted when relationships are not found or when they become too low in interest to continue
- Interest is defined by the types of relationships between things
- Types are also relationships to moods or goals

Conclusion, I'm back to saying that there have to be some absolute elements to knowledge: here the "moods and goals". How can you think of knowledge without being able to point to something and say: the book... the table... the book is on the table (sorry for you native English speakers).

Anyway, it's fun. And what makes it more interesting is that I don't expect to get anything out of it. I'm through with creating a new project every other day like a couple of weeks ago. Time to relax and just keep my mind active.

Talking about keeping my mind active, I was reading a paper earlier today: "Mining Nonambiguous Temporal Patterns for Interval-Based Events" by Shin-Yi Wu and Yen-Liang Chen. It's an interesting paper where the authors propose methods to find patterns in the relations between interval-based events by classifying pair-wise relations using a very simple set of 7 possible relations. It's pretty and all, but when you get to the real-world case, the stock analysis, they make a whole set of simplifying transformations that make the problem, let's say, silly. They use three "event types": (1) the stock price increases for at least 3 days, (2) the stock price decreases for at least 3 days and (3) the stock price increases and decreases at least 3 times. Also they discuss 3 period lengths: week, month and season. Talk about arbitrary definitions here. All stock prices go up and down at least 3 times in a week. They usually do that in a 5 minute period.

Anyway, there are some interesting ideas in the paper, like the process of trying to predict stock movement with the correlation patterns that were found. Interestingly, some of their graphs show an almost random predictive accuracy for the interesting things and very good accuracy for behaviors like "season trends". Not very meaningful, I guess. Also, what I liked about the paper, and what sparked the brainstorming that I've mentioned before, is that what they mine is not the events themselves, but the relationship between the events.
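To make the "events as intervals, relations between pairs" idea concrete, here is a rough Python sketch. It is not the authors' algorithm: I'm assuming the 7 relations are the usual Allen-style interval relations with the inverses removed by always putting the earlier interval first, and `Interval`, `rising_runs`, and `relation` are names I made up.

```python
from typing import List, NamedTuple, Sequence


class Interval(NamedTuple):
    start: int  # index of the first day of the event
    end: int    # index of the last day of the event


def rising_runs(prices: Sequence[float], min_days: int = 3) -> List[Interval]:
    """Event type (1): stretches where the price rose on at least `min_days` consecutive days."""
    runs = []
    start = 0
    for i in range(1, len(prices) + 1):
        # A run ends at the end of the data or when the price fails to rise.
        if i == len(prices) or prices[i] <= prices[i - 1]:
            if i - 1 - start >= min_days:
                runs.append(Interval(start, i - 1))
            start = i
    return runs


def relation(a: Interval, b: Interval) -> str:
    """Classify a pair of intervals into one of 7 relations (assumed set, see above)."""
    # Put the interval that starts first (ties: ends first) on the left,
    # so no inverse relations are needed.
    if (b.start, b.end) < (a.start, a.end):
        a, b = b, a
    if a.start == b.start:
        return "equal" if a.end == b.end else "starts"
    if a.end < b.start:
        return "before"
    if a.end == b.start:
        return "meets"
    if a.end < b.end:
        return "overlaps"
    if a.end == b.end:
        return "finished-by"
    return "contains"


prices = [1, 2, 3, 4, 3, 4, 5, 6, 7]  # made-up daily closing prices
first, second = rising_runs(prices)
print(first, second, relation(first, second))
```

What gets mined afterwards is the stream of relation labels, not the price series itself.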

Tuesday, May 15, 2007

Learning using generative approaches

On my way back home today (much earlier than usual), I started thinking about learning methods. Learning is one of the most interesting things that you can think of on the computer science side of the world, but also one of the most traveled paths. Everybody wants to teach their computer to be a little smarter and not expect it to just repeat what you say.

So, with all this already done, why did I decide to think about it? Do I have an answer to the machine learning problem? Yea, right! I never have answers, but I do have questions and the will to read papers and pursue things that make my evenings more meaningful. And today what I'm looking at are generative models.

Like with all research, you have to start with defining what you mean by the names you use. So, the generative models that I'm talking about are the ones in which the system itself generates inputs to itself. The idea behind it is that you learn by doing it. Not necessarily actually doing it, but by rehearsing doing it inside your world model, your brain. Actually, we are very good at that! We can even understand intangibles, like other people's emotions, by trying to map their experiences and facial expressions to what we would do and determine what we would be feeling if we did it, thus what the person should be feeling.

Also, another interesting example: why are people usually scared during scary movies, or sick during bloody scenes? It's because we are constantly trying to understand what we see by applying it to ourselves, and we do feel scared, we do feel the sickness of a pain that isn't there.

So, back to computers: I believe (like many other researchers that have tackled this problem) that one of the key methods for robust learning (and I'm not talking here of any learning - there are many ways for computers to learn, some very good), is to allow our learners to replay and internalize what happens.

This is much easier said than done, actually. It's very easy to think of learning in the normal learning way: synchronous. You present a case and potentially the answer or a hint about the answer and you let the learner take one step towards learning the model. Then you present the next one and so on. The problem of generative models is that the "will to learn" has to be an action from the learner. The learner should determine what it wants to learn and maybe generate what it thinks it should learn.

This post is already getting much longer than anybody should handle, so I'll try to make it easier and think of an example. Let's say that you want to teach a computer to play Sudoku.

Supervised method:
The "teacher" shows a Sudoku puzzle and then a solution (that can be a step towards the solution, or a piece of the puzzle with a step towards the solution). Then it shows another puzzle and a solution. It keeps showing different puzzles (well, sometimes you can repeat a puzzle to make sure it takes another step towards the solution of that puzzle) and solutions until you decide to stop and show some new puzzles and ask for the solution to see if it learned.

Reinforcement learning:
This is actually a type of supervised learning. Its focus is on delayed gratification: you let the computer try a couple of things and then you zap it if it's not doing very well, or you give it candy if it's doing well. Another possibility is not providing the next correct step, but just saying whether it's right or wrong. It feels much more like the way nature teaches animals, but it is limited to what saying right or wrong can make you learn. My Ph.D. research started with looking at reinforcement learning techniques, and they are slow to learn and usually not very robust (well, if you can claim robustness on something that converges in way too many iterations).
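The "just say if it's right or wrong" variant is easy to sketch in Python. This is a deliberately tiny toy, not a real reinforcement learning algorithm and not what I used in my research: there is no delayed reward, each state is assumed to have exactly one right action, and the names (`learn_by_feedback`, the cell labels, the digits) are all hypothetical.

```python
import random
from typing import Callable, Dict, Hashable, Sequence


def learn_by_feedback(
    states: Sequence[Hashable],
    actions: Sequence[int],
    is_right: Callable[[Hashable, int], bool],
    rounds: int,
    seed: int = 0,
) -> Dict[Hashable, int]:
    """Learn one action per state from right/wrong signals only."""
    rng = random.Random(seed)
    value = {(s, a): 0.0 for s in states for a in actions}
    for _ in range(rounds):
        for s in states:
            # Try the action that currently looks best, breaking ties at random.
            best = max(value[(s, a)] for a in actions)
            choice = rng.choice([a for a in actions if value[(s, a)] == best])
            # The teacher never shows the answer: candy (+1) or zap (-1).
            reward = 1.0 if is_right(s, choice) else -1.0
            value[(s, choice)] += 0.5 * (reward - value[(s, choice)])
    return {s: max(actions, key=lambda a: value[(s, a)]) for s in states}


# Hypothetical empty cells of a 4x4 puzzle and the digit each one should hold.
answers = {"cell_a": 3, "cell_b": 1, "cell_c": 4}
policy = learn_by_feedback(
    states=list(answers),
    actions=[1, 2, 3, 4],
    is_right=lambda cell, digit: answers[cell] == digit,
    rounds=4,
)
print(policy)
```

Even in this toy the learner may need as many tries per cell as there are digits, where a supervised teacher would have needed one example; that is the slowness in miniature.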

Unsupervised learning:
In this type, you allow the computer to see the different games and let it find patterns in them by itself. Then it can use these patterns to solve other games. It's usually also based on showing the learner a set of examples but not saying anything about them. It's interesting, but it's usually very limited in what it can be applied to. I'm not sure it would create a good Sudoku player.

Generative learning:
In this case you can start with any of these methods. But then you allow the learner to either pass back to the teacher a whole new puzzle and ask for a solution, or request a recall of a specific puzzle, or even stop looking for puzzles and try to predict what the next puzzle would be. Actually prediction is a very interesting consequence of these types of approaches. You are no longer really trying to answer a question like A + B = ?, but you are now trying to look at things like A + ? = C. You know what C should be because of your learning, but now you are trying to find other Bs that satisfy the same model. Then you try to look at other As. And then you try to vary C and look again. You build the model by constructing the question and not the answer.
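One narrow reading of "the learner constructs the question" can be shown with a toy that has nothing to do with Sudoku. In this Python sketch the learner is looking for a hidden threshold and decides by itself which value to ask the teacher about next. The setup is entirely my assumption (the teacher answers False below the threshold and True from it onward, and True at `high`), and `learn_threshold` is a made-up name.

```python
from typing import Callable, Tuple


def learn_threshold(
    ask_teacher: Callable[[int], bool], low: int, high: int
) -> Tuple[int, int]:
    """Find the smallest value the teacher accepts, choosing every question ourselves.

    Returns the learned threshold and the number of questions asked.
    """
    questions = 0
    while low < high:
        probe = (low + high) // 2  # the learner builds the question, not the teacher
        questions += 1
        if ask_teacher(probe):
            high = probe
        else:
            low = probe + 1
    return low, questions


hidden = 37  # known only to the teacher
threshold, asked = learn_threshold(lambda x: x >= hidden, 0, 100)
print(threshold, asked)
```

A learner that just waits for whatever examples the teacher happens to show could need far more of them to pin down the same boundary; here every question is picked to cut the remaining uncertainty in half.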

Again, as you must have already realized, I quickly left the realm of Sudoku. So you can't try to implement what I've just written here. Yes, and I'm aware that nobody even thought of doing it besides me - and I haven't actually implemented anything myself, just written a lot of notes on OmniOutliner about what questions I'm trying to answer. And, of course, with no answers themselves. Things like:

How to make a learner use a 4x4 Sudoku as a learning ground for a 9x9?
Should the learner actually learn position and movement too? E.g., should it interact with the outside world like: show me the element to the right of the element I've just seen
Should learning involve separate learning modules for bad and good examples?
How much can you predict before seeing an example? (how much should you learn from the instruction manual - sort of like the ontology duality of intent/extent)
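On the first question, one thing that is easy to show is that the rules themselves don't change with the size of the board. A minimal Python sketch (a checker, not a learner, with made-up helper names) where the only parameter is the box size, 2 for a 4x4 and 3 for a 9x9:

```python
import math
from typing import List

Grid = List[List[int]]


def is_valid_solution(grid: Grid) -> bool:
    """Check rows, columns and boxes of a filled Sudoku of any square size."""
    size = len(grid)
    box = math.isqrt(size)
    if box * box != size or any(len(row) != size for row in grid):
        return False
    expected = set(range(1, size + 1))
    rows = [set(row) for row in grid]
    cols = [{grid[r][c] for r in range(size)} for c in range(size)]
    boxes = [
        {grid[r][c] for r in range(br, br + box) for c in range(bc, bc + box)}
        for br in range(0, size, box)
        for bc in range(0, size, box)
    ]
    return all(group == expected for group in rows + cols + boxes)


def shifted_grid(box: int) -> Grid:
    """Build one valid solution by shifting the first row (a standard construction)."""
    size = box * box
    return [
        [(box * (r % box) + r // box + c) % size + 1 for c in range(size)]
        for r in range(size)
    ]


small, large = shifted_grid(2), shifted_grid(3)
print(is_valid_solution(small), is_valid_solution(large))
```

Since the same function covers both sizes, whatever a learner picks up about rows, columns and boxes on the 4x4 is at least expressible on the 9x9. Whether it can actually transfer it is still the open question.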

Oh, well... At least I have fun and keep my mind occupied! :-)