Moving downstream: October 2007

Monday, October 29, 2007

FriendFeed continued

So I actually got a FriendFeed account! You can check my feed at:

http://friendfeed.com/michelgoldstein

My current "services" are:

This blog
del.icio.us
Google Reader
LinkedIn
Netflix queue
SmugMug

Now you can see how boring I am.

Sunday, October 28, 2007

Why didn't I see this coming:

FriendFeed

It's a site (currently in private beta, so I can't really give too many details) that aggregates all the public information it can get from people you consider your friends. You can see what a person added to their Amazon wish list, what they have posted on their blogs, YouTube... And even some things that I didn't know you could monitor: shared links on Google Reader, favorite songs on Last.fm, things that you dug on Digg... The amount of personally tied information available out there is quite scary.

But the concept of the site is quite good, in my opinion. If you have a bunch of friends that are highly connected and use web tools all the time, it would certainly be quite cool to get a feed on what they are up to. I know I miss blog posts from friends of mine all the time (well, ignoring the fact that most of my friends gave up on their blogs - what a shame).

The most exciting thing about it for me is that it doesn't limit you to one system. MySpace, orkut, and probably Facebook can provide this kind of update for you within their own systems, but they can't show what happens outside them.

On the other hand, what worries me is the number of different places that you have to go to keep your list of friends up-to-date. You sometimes have to go through multi-step registration processes to get each person into the system. And then there is the upkeep (which is probably much more complicated when you are keeping track of friends' actions on the internet). But we will see. I signed up for beta testing it and I'll let you know how this turns out.

Thursday, October 25, 2007

The silly A380

So the Airbus A380 from Singapore Airlines finally took off. It's silly:

Video from National Geographic

Certainly a tech geek's paradise in this configuration. Unfortunately I won't be seeing one for a long time (and probably won't be willing to pay the price to travel in one for even longer).

Wednesday, October 24, 2007

Last one for the evening

Alright, I had to post these other ones:

New "Meme": Manly or Self-Sufficient?
Do Good Things

Those are related: lists of things you should do before you die. Lots of things there that I haven't yet done, of course. On the first lists, I can even claim I have done most of them. The final list is much more interesting though. Here is the list with my comments:

1. Found a utopian colony. - that you find every day
2. Integrate by parts. - I've done that way too many times
3. Decode the Voynich manuscript. - hehe
4. Defy gravity. - what is gravity, anyway? There is no spoon...
5. Recover lost treasure. - Treasure is such a vague word
6. Translate the pre-Socratics. - Another good thing to do in your spare time
7. Make love on the 50-yard line. - ok...?
8. Raise a pig and make sausage from it. - Can it be a dog?
9. Lead a witness. - I'll work on that
10. Outrun a bear. - I won't work on this one
11. Raise a point of order. - politics...
12. Memorize Paradise Lost. - memory?
13. Swoon. - That I don't remember...
14. Wake up in Vegas in a stranger’s house. - What happened there is a thing of the past
15. Collapse a wavefunction. - Also something that I've done way too many times
16. Run for public office. - Sorry, I can't
17. Resign in disgrace. - Hehe
18. Prove Fermat’s Last Theorem. - Can't do that, or else people will forget about poor Fermat
19. Hack into NORAD. - No comments about this one. Who knows who is reading this?
20. Problematize a binary opposition. - cool
21. Square the circle. - much more interesting
22. Kiss in anger. - huh?
23. Batten a hatch. - Sports?
24. Unscramble an egg. - Well, it's just a matter of what is an egg
25. Donate to charity. - cheap plug

Resign in disgrace... That's my next goal!

Tuesday, October 23, 2007

The importance of sleep

In my general "I don't want to go to sleep" evening, while going around reading articles, I came across this interesting report in Scientific American:

Can Lack of Sleep Cause Psychiatric Disorders?

I can't say I'm very surprised by the results shown here, but I just thought it was ironic to be posting this less than 6 hours before my alarm will go off and I'll have to wake up. So I had to do it.

Freebase

Tonight, instead of going to sleep after a quite intellectually busy day, I decided to use my Alpha account for Freebase and give it another try. If you are not aware of Freebase, here is my summary: it's a website that was built to allow people to freely create and maintain structured data about things. It's very web 2.0-y, with lots of annoying AJAX features that only half work but are quite powerful.

There are many ways to review this website. I'll try to touch a little on each of them:

1) Interface: as I mentioned, it's quite annoying. It's full of hover JavaScript actions, with double clicks, single clicks, objects that appear when they want to appear... In a way, they had very good ideas, but JavaScript is still an unpredictable beast. Sometimes it works great, sometimes it's so slow (especially when you are doing AJAX calls to determine what you should show when the user has the mouse over an element) that it is just hard to use. So, great ideas, not very great execution.

2) Features: it's actually quite interesting feature-wise. It contains three different "core" concepts: types, elements and attributes. All have hierarchies. Elements can have as many types as they need. Types also determine which attributes you can fill in for an element: only after adding a type that contains the attribute can you add the attribute to the element. What is its flaw? Simple: it commits to a fixed data representation model. When you constrain yourself to something, like the explicit separation between elements and types, you are restricting what kind of data you can represent. Also, their user interface does not help you create aggregator elements. So, if you want to say that a person did a job between two specific dates, you are out of luck.

3) Data: It's not that easy to tell how good their data is. They did a general import from Wikipedia, and most articles that I've seen come directly from there. User contributions are harder to find. So, everything that came from Wikipedia isn't very well-structured and is not necessarily authoritative. Also, because of limitations on how to represent things, you can find the same data repeated, or very similarly defined with different models. For instance, I was trying to look into Mstislav Rostropovich. He was set as a musician, but not as a conductor or cellist. So, my first task was to add those types. After I created Cellist, I saw that other people had used conductor and cellist as professions and not as "is-a" types. Which is valid, but just a different model. And those examples are everywhere.
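To make the type constraint from item 2 and the duplicated-model problem from item 3 a bit more concrete, here is a minimal Python sketch. It is my own toy and not Freebase's API: the Topic class, the TYPE_ATTRIBUTES table, and every type and attribute name in it are assumptions made up for illustration.

```python
# Toy model of type-gated attributes. Every name here is hypothetical;
# none of it is Freebase's real schema or API.
TYPE_ATTRIBUTES = {
    "Person": {"profession"},
    "Musician": {"instrument"},
    "Cellist": set(),
}


class Topic:
    def __init__(self, name, types=()):
        self.name = name
        self.types = set()
        self.attributes = {}
        for type_name in types:
            self.add_type(type_name)

    def add_type(self, type_name):
        if type_name not in TYPE_ATTRIBUTES:
            raise KeyError(f"unknown type: {type_name}")
        self.types.add(type_name)

    def set_attribute(self, attribute, value):
        # An attribute is only allowed once some type of the topic declares it.
        allowed = set().union(*(TYPE_ATTRIBUTES[t] for t in self.types))
        if attribute not in allowed:
            raise ValueError(f"{attribute!r} needs a type that declares it")
        self.attributes.setdefault(attribute, set()).add(value)


# Model A: "cellist" as a type (is-a).
by_type = Topic("cellist modeled as a type", types=["Person", "Musician", "Cellist"])

# Model B: "cellist" as the value of a profession attribute.
by_profession = Topic("cellist modeled as a profession", types=["Person", "Musician"])
by_profession.set_attribute("profession", "Cellist")


def is_cellist_by_type(topic):
    return "Cellist" in topic.types


def is_cellist_by_profession(topic):
    return "Cellist" in topic.attributes.get("profession", set())
```

Both topics describe the same fact, but a query written against one model silently misses the topics entered with the other, which is the kind of inconsistency I kept running into.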

My conclusion: it's an interesting experiment, but it lacks a lot of features needed to make it a viable solution for knowledge representation. There are other systems out there that I want to have a look at:

Twine
Powerset

I can't really tell what either of them is about, but we'll see what comes out of it. I've sent my request to be added to their beta testers, so I'll just wait.

Thursday, October 18, 2007

Discussing data representation

It's interesting how data representation discussions always teach me something. I've read countless papers about it, but it seems like it takes me actually talking about it, arguing with different-minded people, to understand some of the concepts behind what exists.

So, today it was a discussion that actually opened my eyes to the "beauty" of RDF. I'll have to admit that it's sometimes hard to work with all the verbosity of it. It's also strange to have to create intermediary "fake" elements, just so that you can tie things together in a more coherent way. And the more elements you add, the more expensive calculations become and so on.

Why is it good then? Because its very simple triple limitation provides you with what I'm starting to believe is one of the most powerful things in data representation: schema compatibility.

Schemas change all the time and maintaining them is always going to be painful. However, if a change can't really break the elements that are expected at a certain section of your graph, everything becomes much more powerful. Let me try to explain with an example:

You are modeling a catalog of items and somebody tells you that you have a concept called "list price". The first thing that comes to your mind as a modeler is that list price has a single value, in the currency of the given marketplace. So that's how you model it and go on with your life, making good use of your new attribute.

Then comes a request to aggregate the data from an already-existing, very similarly modeled catalog that had one important difference: it was better tailored to different marketplaces. When you start the import procedure you suddenly realize that your assumption that "list price" is a single value for a single-currency marketplace is not valid any more. In some places of the world, for products imported from neighboring countries, customers also like to see the list price in the original currency!

If your catalog is very database-driven, you are very likely looking at having to change your schema, maybe backfill all your items, and have either some downtime or a period in which your catalog is in transition and so not completely consistent. It's a lot of pain that just keeps growing as your catalog grows.

Now, if your underlying data representation were just an RDF triple graph, adding a new edge of the same type wouldn't really cause any problems. You only have to fix the validation code and, maybe, add some filtering code so that clients that can only consume the list price in the marketplace's currency see only the data they are used to seeing, and voilà, you are done. No backfill, no database cleanup, no database schema changes (but you do change your RDF schema to increase the cardinality of the connection restriction).
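Here is a minimal sketch of that list price example in Python, using a plain set of tuples as a stand-in for a triple store rather than any real RDF library. All the identifiers (item:42, price:1, listPrice, amount, currency) and the helper functions are made up for illustration. Note that each price node is one of those intermediary "fake" elements that tie an amount to its currency.

```python
# A toy triple store: a set of (subject, predicate, object) tuples.
# All identifiers are hypothetical and chosen only for this example.
triples = {
    ("item:42", "listPrice", "price:1"),
    ("price:1", "amount", "19.99"),
    ("price:1", "currency", "USD"),
}


def objects(store, subject, predicate):
    """All objects reachable from subject through predicate."""
    return {o for s, p, o in store if s == subject and p == predicate}


def list_prices(store, item, currency=None):
    """Return (amount, currency) pairs, optionally filtered by currency."""
    result = []
    for price in objects(store, item, "listPrice"):
        (amount,) = objects(store, price, "amount")
        (cur,) = objects(store, price, "currency")
        if currency is None or cur == currency:
            result.append((amount, cur))
    return sorted(result)


# The imported catalog brings a second list price in the original currency.
# It is just one more edge of the same type; nothing existing is rewritten.
triples |= {
    ("item:42", "listPrice", "price:2"),
    ("price:2", "amount", "24.50"),
    ("price:2", "currency", "CAD"),
}

# Old clients keep the filtered view; new clients can ask for everything.
legacy_view = list_prices(triples, "item:42", currency="USD")
full_view = list_prices(triples, "item:42")
```

The only code that had to know about the change is the currency filter, which plays the role of the filtering layer for clients that expect a single list price.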

Oh, well, unfortunately sometimes you just have to live with other options due to optimization, so that you can handle millions of updates a day with a reasonably small fleet of machines. It's what engineers were bred to do, I guess.

3 years working for Amazon

So today I'm celebrating the fact that I have been at Amazon.com for 3 years. I don't know if celebrating is the right term for it, but, in certain ways, it surely feels like a celebration. Not because I'm getting RSUs today and the Amazon share price is the highest it has been since I joined, but because it's been 3 years.

The question that came to my mind was: am I closer to the ideal that I was seeking when I joined? I feel like I don't quite know how to answer this question. In many ways, yes. I have learned a lot in the last 3 years, and I'm still learning. This is the inward-looking view of it. As for the outward-looking one, the one that looks at the organization that I'm working for, the company, and the world that this company serves: do I think that I'm closer to what I was expecting to get?

Without going into information that I wouldn't feel very comfortable writing about, I think the answer is "maybe". There are some signs of improvement. However, I feel like there are many more signs of loss on my part, of deciding that what I had in mind will never get done. It could probably mean that I've become more experienced and can tell what is feasible and what isn't; but it could also mean that I'm losing the courage to dream. I'm becoming engrossed in the operations of life.

Anyway, 3 years... Quite a chunk of my life here. Almost 7 years in the U.S.A. An even bigger chunk of my life.

Wednesday, October 17, 2007

Excel 2007 Bug

When I read this last night I knew I had to post it:

Excel 2007's numeric display bug

It's one of the weirdest bugs I've ever seen. I've tested the bug myself and it's really there. An interesting challenge to figure out what is going on. Maybe I'll use it as an interview question some day! :-)
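For anyone who wants a head start on the puzzle: the case that was widely reported is the formula =850*77.1, which Excel 2007 displays as 100000 instead of 65535. The small sketch below is in Python, not Excel, and only shows the floating-point side of the story: in binary floating point the product is not exactly 65535 but a hair below it. My assumption is that the bug lives in how Excel formats that nearby value for display, not in the arithmetic itself.

```python
# 77.1 has no exact binary representation, so the product lands just
# below 65535 instead of exactly on it (standard IEEE 754 doubles).
product = 850 * 77.1
gap = 65535 - product

print(product == 65535)  # False: the two values differ
print(repr(product))     # the shortest decimal that round-trips the double
print(gap == 2 ** -37)   # True: the product is one representable step below 65535
```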

Tuesday, October 16, 2007

Commas and quotes

I've mentioned in the past that I find it silly that in American English you have to put the commas inside the quotes. Because of this, I had to post this story from Worse Than Failure:

It's a Different Set of Rules

Depressing...

Audio: what is enough?

Yes, I know I haven't been blogging much lately. It's not because I haven't had ideas to blog about, just that I haven't reached many interesting conclusions from them. So, what do you do when you don't think you have anything interesting to blog about? You post links! So here is my link for today (I was planning on sending more links, but I'll keep it simple for now):

Is really expensive audio equipment worth the price? Very interesting article triggered by an amazing new speaker cable that was priced at US$7250.00!

This also reminds me of hearing that there have been some interesting developments in the music industry lately, with some bands deciding to break away from the normal music industry and provide their albums online for whatever price you want to pay. Interesting concept, but I'm not sure it's big enough yet to shake the industry in any way.

Where do I think the music industry is going? Nowhere... I continue to think that we will have the classic competing forces: on one side, the bands that are usually not very good but are sponsored by a lot of money and sell a lot just because of exposure; on the other, smaller bands that benefit from the lower cost of producing albums and have a wide range of quality. There are some very good things there, but most of the time they are buried in the noise of bands that probably just had way too much free time on their hands.

Diversity is great, but it can be claimed that if you lower the bar you will just increase the amount of noise you get and decrease the proportion of good music to bad music that is recorded.

Thursday, October 04, 2007

Back from Yellowstone

So I'm back. Today was a day of fruitful discussions (or at least I like to fool myself thinking that they were fruitful) and catching up with what happened in the days I was out. Also, I've started to organize my pictures from the trip... About 900 pictures... It's going to be a long process to clean up everything and then upload it somewhere for people to access it.

But enough about today (or maybe yesterday). Some words about the trip itself: it was fantastic! Grand Teton and Yellowstone are amazing places. All the free-roaming animals, the strange rocks, breathtaking waterfalls, and scary hydrothermal features make it a unique place. Something people have to visit to understand.

Weather was alright. It was the end of the season, the time when the weather becomes a little bit less predictable. So we got snow almost every day, except for Friday and Sunday (which was our most active walking-around day and had the best weather overall). Between Monday and Tuesday it snowed about an inch. This made us a little scared, as we had to drive back through passes to Idaho Falls to catch our flight.

Everything ended up working fine. There was no snow on the roads and we arrived at the airport 3 hours before our flight. Enough for us to... do nothing for a long time. I read some of my book, some articles on my BlackBerry (trying hard not to read emails), and, soon enough, we were back home.

Now it's getting late and I should really go to bed. Just because I slept a lot during my vacation (there isn't much to do there when it's dark outside), I shouldn't decide to compensate by not sleeping for the next few days...
