Monday, October 29, 2007

FriendFeed continued

So I actually got a FriendFeed account! You can check my feed at:

http://friendfeed.com/michelgoldstein

My current "services" are:
  • This blog
  • del.icio.us
  • Google Reader
  • LinkedIn
  • Netflix queue
  • SmugMug
Now you can see how boring I am.

Sunday, October 28, 2007

FriendFeed

Why didn't I see this coming:

FriendFeed

It's a site (currently in private Beta, so I can't really give too many details) that aggregates all the public information it can get from people you consider your friends. You can see what a person added to their Amazon wish list, what they have posted on their blogs, YouTube... And even some things that I didn't know you could monitor: shared links on Google Reader, favorite songs on Last.fm, things that you dug on Digg... The amount of personally tied information available out there is quite scary.

But the concept of the site is quite good, in my opinion. If you have a bunch of friends that are highly connected and use web tools all the time, it would certainly be quite cool to get a feed on what they are up to. I know I miss blog posts from friends of mine all the time (well, ignoring the fact that most of my friends gave up on their blogs - what a shame).

The most exciting thing about it for me is that it doesn't limit you to a single system. MySpace, orkut, and probably Facebook can provide this kind of update within their own systems, but they can't show what happens outside them.

On the other hand, what worries me is the number of different places that you have to go to keep your list of friends up-to-date. Sometimes you have to go through multi-step registration processes to get each person into the system. And then there is the upkeep (which is probably much more complicated in the case of keeping track of friends' actions on the internet). But we will see. I signed up for Beta testing it and I'll let you know how this turns out.

Thursday, October 25, 2007

The silly A380

So the Airbus 380 from Singapore Airlines finally took off. It's silly:

Video from National Geographic

Certainly a tech geek's paradise in this configuration. Unfortunately I won't be seeing one for a long time (and probably won't be willing to pay the price to travel in one for even longer).

Wednesday, October 24, 2007

Last one for the evening

Alright, I had to post these other ones:

New "Meme": Manly or Self-Sufficient?
Do Good Things

Those are related: lists of things you should do before you die. Lots of things there that I haven't yet done, of course. On the first lists, I can even claim I have done most of them. The final list is much more interesting though. Here is the list with my comments:

1. Found a utopian colony. - that you find every day
2. Integrate by parts. - I've done that way too many times
3. Decode the Voynich manuscript. - hehe
4. Defy gravity. - what is gravity, anyway? There is no spoon...
5. Recover lost treasure. - Treasure is such a vague word
6. Translate the pre-Socratics. - Another good thing to do on your spare time
7. Make love on the 50-yard line. - ok...?
8. Raise a pig and make sausage from it. - Can it be a dog?
9. Lead a witness. - I'll work on that
10. Outrun a bear. - I won't work on this one
11. Raise a point of order. - politics...
12. Memorize Paradise Lost. - memory?
13. Swoon. - That I don't remember...
14. Wake up in Vegas in a stranger’s house. - What happened there is a thing of the past
15. Collapse a wavefunction. - Also something that I've done way too many times
16. Run for public office. - Sorry, I can't
17. Resign in disgrace. - Hehe
18. Prove Fermat’s Last Theorem. - Can't do that, or else people will forget about poor Fermat
19. Hack into NORAD. - No comments about this one. Who knows who is reading this?
20. Problematize a binary opposition. - cool
21. Square the circle. - much more interesting
22. Kiss in anger. - huh?
23. Batten a hatch. - Sports?
24. Unscramble an egg. - Well, it's just a matter of what is an egg
25. Donate to charity. - cheap plug

Resign in disgrace... That's my next goal!

Tuesday, October 23, 2007

The importance of sleep

In my general "I don't want to go to sleep" evening, while going around reading articles, I came across this interesting report on Scientific American:

Can Lack of Sleep Cause Psychiatric Disorders?

I can't say I'm very surprised by the results shown here, but I just thought it was ironic to be posting this less than 6 hours before my alarm will go off and I'll have to wake up. So I had to do it.

Freebase

Tonight, instead of going to sleep after a quite intellectually busy day, I decided to use my Alpha account on Freebase and give it another try. If you are not aware of Freebase, here is my summary: it's a website that was built to allow people to freely create and maintain structured data about things. It's very web 2.0-y, with lots of annoying AJAX features that half-work but are quite powerful.

There are many ways to review this website. I'll try to touch a little on each of them:

1) Interface: as I mentioned, it's quite annoying. It's full of hover JavaScript actions, with double clicks, single clicks, objects that appear when they want to appear... In a way, they had very good ideas, but JavaScript is still an unpredictable beast. Sometimes it works great, sometimes it's so slow (especially when you are doing AJAX calls to determine what you should show when the user has his/her mouse on top of an element) that it is just hard to use. So, great ideas, not very great execution.

2) Features: it's actually quite interesting feature-wise. It contains three different "core" elements: types, elements and attributes. All have hierarchies. Elements can have as many types as they need. Types also determine which attributes you can fill in for an item: only after adding a type that contains an attribute can you add that attribute to the item. What is its flaw? Simple: it commits to one data representation model. When you constrain yourself to something, like the explicit separation between elements and types, you are restricting what type of data you can represent. Also, their user interface does not help you create aggregator elements. So, if you want to say that a person did a job between two specific dates, you are out of luck.

3) Data: It's not that easy to tell how good their data is. They did a general import from Wikipedia, and most articles that I've seen come directly from there. User contributions are harder to find. So, everything that came from Wikipedia isn't very well-structured and is not necessarily authoritative. Also, because of limitations on how to represent things, you can find the same data repeated, or very similarly defined with different models. For instance, I was trying to look into Mstislav Rostropovich. He was set as a musician, but not a conductor or cellist. So, my first task was to add those types. After I created Cellist, I saw that other people had used conductor and cellist as professions and not "is-a". Which is valid, but just a different model. And those examples are everywhere.
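
The type/element/attribute idea from point 2, and the two competing Rostropovich models from point 3, can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the class names, the "profession" and "instrument" attributes, and the "Person" type are all hypothetical, hierarchies are left out, and none of this is Freebase's actual API.

```python
from dataclasses import dataclass, field


@dataclass
class Type:
    """A type lists the attributes that its elements may carry (hypothetical model)."""
    name: str
    attributes: set[str] = field(default_factory=set)


@dataclass
class Element:
    """An element can have any number of types, plus attribute values."""
    name: str
    types: list[Type] = field(default_factory=list)
    values: dict[str, list[str]] = field(default_factory=dict)

    def allowed_attributes(self) -> set[str]:
        # Union of the attributes defined by every type on this element.
        allowed: set[str] = set()
        for t in self.types:
            allowed |= t.attributes
        return allowed

    def set_attribute(self, attribute: str, value: str) -> None:
        # An attribute can only be filled in once a type that defines it is present.
        if attribute not in self.allowed_attributes():
            raise ValueError(f"{self.name} has no type that defines {attribute!r}")
        self.values.setdefault(attribute, []).append(value)


musician = Type("Musician", {"instrument"})
cellist = Type("Cellist")                 # "is-a" model: the profession is a type
person = Type("Person", {"profession"})   # attribute model: the profession is a value

rostropovich = Element("Mstislav Rostropovich", types=[musician])

try:
    rostropovich.set_attribute("profession", "Conductor")
    rejected = False
except ValueError:
    rejected = True   # no type on the element defines "profession" yet

rostropovich.types.append(cellist)                    # model A: "is a Cellist"
rostropovich.types.append(person)                     # model B needs the Person type first
rostropovich.set_attribute("profession", "Cellist")   # model B: "profession = Cellist"
```

Both models end up on the same element and say the same thing in different ways, which is the kind of duplication described above.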

My conclusion: it's an interesting experiment, but it lacks a lot of features to make it a viable solution for knowledge representation. There are other systems out there that I want to have a look at:

Twine
Powerset

I can't really tell what either of them is about, but we'll see what comes out of it. I've sent my request to be added to their beta testers, so I'll just wait.

Thursday, October 18, 2007

Discussing data representation

It's interesting how data representation discussions always teach me something. I've read countless papers about it, but it seems like it takes actually talking about it, arguing with different-minded people, for me to understand some of the concepts behind what exists.

So, today it was a discussion that actually opened my eyes to the "beauty" of RDF. I'll have to admit that it's sometimes hard to work with all the verbosity of it. It's also strange to have to create intermediary "fake" elements, just so that you can tie things together in a more coherent way. And the more elements you add, the more expensive calculations become and so on.

Why is it good then? Because its very simple triple limitation provides you with what I'm starting to believe is one of the most powerful things in data representation: schema compatibility.

Schemas change all the time and maintaining them is always going to be painful. However, if a change can't really break what is expected at a given section of your graph, everything becomes much more powerful. Let me try to explain with an example:

You are modeling a catalog of items and somebody tells you that you have a concept called "list price". The first thing that comes to your mind as a modeler is that list price has one value, in the currency of a given marketplace. So that's how you model it and go on with your life, making good use of your new attribute.

Then comes a request to aggregate the data from an already-existing, very-similarly-modeled catalog that has one important difference: it is better tailored to different marketplaces. When you start the import procedure you suddenly realize that your assumption that "list price" is a single value for a single-currency marketplace is not valid any more. In some places of the world, for products imported from neighboring countries, customers also like to see the list price in the original currency!

If your catalog is very database-driven, you are very likely looking at having to change your schema, maybe backfill all your items and have some sort of either downtime or time that your catalog is in transition, so not completely consistent. It's a lot of pain that just keeps growing as your catalog grows.

Now, if your underlying data representation were just an RDF triple graph, adding a new edge of the same type wouldn't really cause any problems. You only have to fix the validation code and, maybe, add some filtering code so that clients that can only consume the list price in the marketplace's currency keep seeing the data they were used to seeing, and voilà, you are done. No backfill, no database cleanup, no database schema changes (but you do change your RDF schema to increase the cardinality of the connection restriction).
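
Here is a minimal sketch of that scenario using plain Python tuples as triples instead of a real RDF library. All identifiers (`item:42`, `catalog:listPrice`, the `price:*` nodes and so on) are made up for illustration; the `price:*` nodes play the role of the intermediary "fake" elements mentioned earlier.

```python
# A graph is just a set of (subject, predicate, object) triples.
# Every name below is hypothetical and only serves the example.
Triple = tuple[str, str, object]

graph: set[Triple] = {
    ("item:42", "catalog:marketplace", "marketplace:CA"),
    ("marketplace:CA", "catalog:currency", "CAD"),
    ("item:42", "catalog:listPrice", "price:1"),
    ("price:1", "catalog:amount", 24.99),
    ("price:1", "catalog:currency", "CAD"),
}


def objects(graph, subject, predicate):
    """All objects reachable from `subject` through `predicate`."""
    return [o for s, p, o in graph if s == subject and p == predicate]


def list_prices(graph, item):
    """Every list price of an item as (amount, currency), sorted by currency."""
    prices = []
    for node in objects(graph, item, "catalog:listPrice"):
        amount = objects(graph, node, "catalog:amount")[0]
        currency = objects(graph, node, "catalog:currency")[0]
        prices.append((amount, currency))
    return sorted(prices, key=lambda price: price[1])


def legacy_list_prices(graph, item):
    """Filter for old clients: only the price in the marketplace's own currency."""
    marketplace = objects(graph, item, "catalog:marketplace")[0]
    home_currency = objects(graph, marketplace, "catalog:currency")[0]
    return [price for price in list_prices(graph, item) if price[1] == home_currency]


# The new requirement: also carry the list price in the original currency.
# It is just one more edge of the same type; nothing existing is rewritten.
graph |= {
    ("item:42", "catalog:listPrice", "price:2"),
    ("price:2", "catalog:amount", 19.99),
    ("price:2", "catalog:currency", "USD"),
}

all_prices = list_prices(graph, "item:42")
old_client_view = legacy_list_prices(graph, "item:42")
```

After the change, `all_prices` holds both prices while `old_client_view` still holds only the CAD one, so existing consumers are unaffected.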

Oh, well, unfortunately sometimes you just have to live with other options due to optimization, so that you can handle millions of updates a day with a reasonably small fleet of machines. It's what engineers were bred to do, I guess.

3 years working for Amazon

So today I'm celebrating the fact that I have been at Amazon.com for 3 years. I don't know if celebrating is the right term for it, but, in certain ways, it surely feels like a celebration. Not because I'm getting RSUs today and Amazon share price is the highest it has been since I have joined, but because it's been 3 years.

The question that came to my mind was: am I closer to the ideal that I was seeking when I joined? I feel like I don't quite know how to answer this question. In many ways, yes. I have learned a lot in the last 3 years, and I'm still learning. This is the inward-looking view of it. The outward-looking one, the one that looks at the organization that I'm working for, the company, and the world that this company serves, asks a different question: do I think that I'm closer to what I was expecting to get?

Without going into details that I wouldn't feel very comfortable writing about, I think the answer is "maybe". There are some signs of improvement. However, I feel like there are many more signs of loss on my part, of decisions meaning that what I had in mind will never get done. It could probably mean that I've become more experienced and can tell what is feasible and what isn't; but it could also mean that I'm losing the courage to dream. I'm becoming engrossed in the operations of life.

Anyway, 3 years... Quite a chunk of my life here. Almost 7 years in the U.S.A. An even bigger chunk of my life.

Wednesday, October 17, 2007

Excel 2007 Bug

When I read this last night I knew I had to post it:

Excel 2007's numeric display bug

It's one of the weirdest bugs I've ever seen. I've tested the bug myself and it's really there. An interesting challenge to figure out what is going on. Maybe I'll use it as an interview question some day! :-)

Tuesday, October 16, 2007

Commas and quotes

I've mentioned in the past that I find it silly that in American English you have to put the commas inside the quotes. Because of this, I had to post this story from Worse Than Failure:

It's a Different Set of Rules

Depressing...

Audio: what is enough?

Yes, I know I haven't been blogging much lately. It's not because I haven't had ideas to blog about, just that I think I haven't reached many interesting conclusions from my ideas. So, what do you do when you don't really think you have something interesting to blog? You post links! So here is my link for today (I was planning on sending more links, but I'll keep it simple for now):

Is really expensive audio equipment worth the price? Very interesting article triggered by an amazing new speaker cable that was priced at US$7250.00!

This also reminds me of hearing that there are some interesting developments in the music industry lately, with some bands deciding to break away from normal music industry and provide their albums online for whatever price you want to pay. Interesting concept, but I'm not sure it's big enough yet to shake the industry in any way.

Where do I think the music industry is going? Nowhere... I continue to think that we will have the classic competing forces: on one side the bands that are usually not very good but are sponsored by a lot of money and sell a lot just because of exposure, and on the other the smaller bands that are benefiting from the lower cost of producing albums and have a wide range of quality. Some very good things, but most of the time they are buried in the noise of bands that probably just had way too much free time on their hands.

Diversity is great, but it can be claimed that if you lower the bar you will just increase the amount of noise you get and decrease the proportion of good music to bad music that is recorded.

Thursday, October 04, 2007

Back from Yellowstone

So I'm back. Today was a day of fruitful discussions (or at least I like to fool myself thinking that they were fruitful) and catching up with what happened in the days I was out. Also, I've started to organize my pictures from the trip... About 900 pictures... It's going to be a long process to clean up everything and then upload it somewhere for people to access it.

But enough about today (or maybe yesterday). Some words about the trip itself: it was fantastic! Grand Teton and Yellowstone are amazing places. All the free-roaming animals, the strange rocks, breathtaking waterfalls, and scary hydrothermal features make it a unique place. Something people have to visit to understand.

Weather was alright. It was the end of the season and the time when the weather becomes a little bit less predictable. So we got snow almost every day, except for Friday and Sunday (which was our most active walking-around day and the best weather overall). Between Monday and Tuesday it snowed about an inch. This made us a little scared as we had to drive back through passes to Idaho Falls to catch our flight.

Everything ended up working fine. There was no snow on the roads and we arrived at the airport 3 hours before our flight. Enough for us to... do nothing for a long time. I read some of my book, some articles on my BlackBerry (trying hard not to read emails), and, soon enough, we were back home.

Now it's getting late and I should really go to bed. Just because I slept a lot during my vacations (there isn't much to do there when it's dark outside), I shouldn't just decide on compensating by not sleeping for the next few days...

Thursday, September 27, 2007

Music

So the Amazon MP3 download store launched... So I had to go and check it out, of course! What happens when suddenly I am given a chance of quick gratification? Well, I take it! So I've been rediscovering music. Going through albums or music that I always wanted but just never had the "courage" to buy and just downloading it. It's not that expensive (I've paid so far an average of US$8 per album) and it's so easy...

Oh, don't think that I've spent a fortune. It was only about 6 albums so far. And I'm pretty happy about it. So excited that I ended up staying up until almost 2 am yesterday listening to three different versions of "...which was the son of..." by Arvo Pärt (one of which I was singing).

Anyway, music is fun. I just wish I had more time to listen to it outside of being on the bus (way too noisy to enjoy) or at work (which I tried yesterday and it just ended up giving me a headache instead of allowing me to concentrate better on what I wanted to do).

Sunday, September 16, 2007

New month, new year, just thoughts

So it's 5768... In these terrible days between Rosh Hashana and Yom Kippur, we are supposed to sit and think about all the terrible things that we have done in the last year and ask for forgiveness from the people that we may have hurt. I can make abstract claims that this blog is one of the things that have suffered this year, but it's not why I'm posting today. I'm posting today because in the middle of all these terrible and depressing thoughts, when you get your mind into thinking mode, it brings also some "positive" ideas.

What makes this even more interesting and cyclic is that these ideas end up, for me, being even more tormenting than the thought of the past. It's worse to think about what options you will have in the future, and that you probably won't accomplish any of them, than to think about the past.

Out of these ideas that have been tormenting me, some I can't really write about here. Most have to do with product reviews. I feel like most review websites out there could really benefit from a more scientific approach. Not only this, I also think that any company that is trying to make a product successful would benefit from a third-party company testing its product against competitors following scientific evaluations and then allowing time for the original company to analyze and criticize the methods employed. This can only happen if:

1) You have all the methods explicitly explained (not only ratings like: cleaning abilities - Great; smell - Good)
2) You have a relationship with the manufacturer and allow for unedited reply to analysis
3) You present all this to customers and allow the customers to suggest tests to be done (I care about MP3 players that can fall from my pocket and still work)
4) You also allow consumers to explain what their most important features in an item are, in order to: (a) generate a single-dimensional list of products they should look into; (b) provide feedback for manufacturers on what customers are looking for so that they spend their R&D on things that matter.

If you create this cycle, you might have high-quality, authoritative item reviews that will provide consumers with the power to select the best product that matches their needs and manufacturers to modify the products to better satisfy customers.
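
Point 4(a), collapsing many test results into a single-dimensional list per consumer, could be as simple as a weighted average. The sketch below is just one possible way to do it: the product names, feature names, scores and weights are all invented for the example, and a real system would need to worry about normalizing scores across different tests.

```python
def rank_products(scores, weights):
    """Rank products by the weighted average of their per-feature test scores.

    scores:  {product: {feature: score}}, hypothetical scores on a 0-10 scale
    weights: {feature: importance}, as stated by one particular consumer
    """
    total_weight = sum(weights.values())
    ranked = []
    for product, features in scores.items():
        # Features that were never tested for a product count as 0.
        weighted = sum(weights[f] * features.get(f, 0.0) for f in weights)
        ranked.append((product, round(weighted / total_weight, 2)))
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


# Made-up test results for two imaginary MP3 players.
scores = {
    "Player A": {"drop_resistance": 9.0, "battery_life": 5.0, "sound": 6.0},
    "Player B": {"drop_resistance": 3.0, "battery_life": 9.0, "sound": 9.0},
}

# A consumer whose player keeps falling out of their pocket.
weights = {"drop_resistance": 3.0, "battery_life": 1.0, "sound": 1.0}

ranking = rank_products(scores, weights)
```

With these weights Player A comes out on top even though Player B scores better on two of the three features; the same `weights`, aggregated over many consumers, would also be the feedback for manufacturers from point 4(b).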

There are examples out there of specialized sites that try to approach things like that, but I have yet to see (it doesn't mean that it doesn't exist, just that I don't know of it right now) a system that includes all these feedback approaches.

The rest of what I'm interested in is how to represent all this data in order to display it on some website and learn from it what makes a good product. Also, there will always be some subjective component to all reviews. How this can be split out of the "scientific" part is tricky. Any good statistician knows that all statistics are biased towards what you want to prove or disprove. The important part is to make it as explicit as possible.

Oh, well, anyway, I'm still thinking about things. I wanted to send a lot of emails yesterday and today to friends, but didn't get to send any. I was just reading things and doing some random coding here and there. I watched "The Corporation" (great movie, by the way), played some Wii and DS (so odd), and, shockingly, even did some art. Nothing exciting, I'll have to admit.

Ok, time to go to bed.

Thursday, August 30, 2007

Silly things that keep me up

Sometimes there are some silly things out there that keep me up until late at night. Last night this was my reason:

http://www4.wiwiss.fu-berlin.de/factbook/snorql/

Extremely addictive! The data is a little problematic in some classic areas like units of measure (they are implied by the relationship type, but not explicit anywhere) and uneven structuring (it contains things like "factbook:publicdebt '50% of GDP'" and sometimes comma-separated values to deal with lists of things, instead of having multiple relationships).

But you can query for things like: give me the countries that have a life expectancy of at least 65 years and do not produce any oil:


SELECT ?name WHERE {
  ?x a factbook:Country .
  ?x factbook:lifeexpectancyatbirth_totalpopulation ?life .
  ?x factbook:countryname_conventionalshortform ?name .
  ?x factbook:oil_production ?oil .
  FILTER ( ?oil = 0 && ?life >= 65 )
}


(I first tried the neighboring countries of these countries, but got some exception of missing table... odd)

And here is where this link came from: Open Data: Information wants to be linked

What do you need to be intelligent?

There was an interesting post at Cognitive Daily today (well, technically yesterday):

Does an artificial intelligence require a body?

Certainly the authors and the readers courageous enough to leave a comment don't quite agree with this assertion (well, at least most of them). I don't agree myself, but with a small exception: I believe that the only way something can be intelligent is if it can interact with its environment. In other words, you can't make something intelligent by just making it consume monstrous amounts of data (e.g., Google's amazing search index will never be intelligent). Intelligence comes from feedback on the data itself, from the ability to organize data and to predict based on that organization. You can't predict correctly if you don't have some sort of control over the things you are predicting.

Well, at least that's what I believe. I'm certainly not as "educated" in the subject as the people that read the Cognitive Daily blog, but, like most software developers, I've tried my "luck" in AI-like things and you learn...

Tuesday, August 21, 2007

25 years of the CD

It's strange how silly events like this make people start thinking about the current state of music in the world. It feels like everywhere I look there is an article or a discussion about how crappy the music we are listening to is right now. From the dynamic range (IEEE Spectrum has an interesting article with examples about it) to the actual sound quality - imagine that some songs you buy online are compressed to ridiculous rates like 128 kbps. And people are so used to bad quality that they can't even tell any more.
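
To put the 128 kbps figure in perspective, here is a quick back-of-the-envelope calculation against the bitrate of CD audio. The CD parameters (44.1 kHz, 16 bits, stereo) are the standard ones; the comparison only says how much data is thrown away or squeezed, not how it sounds, which depends on the codec.

```python
# Standard audio CD parameters.
SAMPLE_RATE_HZ = 44_100
BITS_PER_SAMPLE = 16
CHANNELS = 2

# Uncompressed CD bitrate in kilobits per second.
cd_kbps = SAMPLE_RATE_HZ * BITS_PER_SAMPLE * CHANNELS / 1000

# How many times smaller a 128 kbps download is.
download_kbps = 128
ratio = cd_kbps / download_kbps

print(f"CD audio: {cd_kbps:.1f} kbps, about {ratio:.0f}x the size of a {download_kbps} kbps file")
```

So a 128 kbps song carries roughly one eleventh of the bits of the same song on a CD.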

I can say that I've been feeling part of this issue. For some time I tried to listen to music on the bus on the way to work and back, and it's painful. Anything with any real dynamic range forces you to keep changing volume all the time or else you are either destroying your ears on the fortissimos, or not listening to anything on the mezzo-pianos (yes, buses are quite noisy). Right now I don't listen to anything besides the news on my iPod. And for that it's pretty useful.

There I went adding a little bit more literature to the whole "the CD has destroyed music" subject. Digital compression has destroyed music, actually. And we are still going this route, with a lot of ground to cover until we will find ourselves either lost or at a precipice.

Saturday, August 18, 2007

Doing this and that

So here I am again to write something, so that people don't think I completely abandoned the idea of blogging. I haven't really, but I'm never sure what to write. Lately life has been a little unfocused, to say the least. And when you don't focus much on one thing you end up not really having much to say.

What have I been reading? Well, I finally finished reading Steven Erikson's Midnight Tides. Good book, but quite long. Not a very easy read, mostly because there are a lot of characters spread around the world and with somewhat similar names. There is a section on the characters, but it doesn't really help much (it contains useless info like:

Arahathan, a mage

)

I've also finished reviewing yet another version of the major paper of my Ph.D. research and now it's back to the reviewers for another round. Everything just takes so much time, and with my terrible lack of free time, it makes cycles even longer. Writing papers takes a lot of effort - a lot of continuous time invested in it. Every interruption is very expensive. And I've received a confirmation that another paper that I've co-written with a lab-mate in Oklahoma is getting published sometime in October, I think, in JASIST. It's exciting.

In the middle of all this excitement, I have stopped pretty much all my ontology projects. I was building three large-ish ones at home, but they have stalled and now it's hard to get back to them. They are a great exercise for anybody trying to do any modeling of anything. It's hard. You just have to learn that you will be wrong and you will be changing your mind many, many times until you are finished. So be ready to almost fully rewrite everything you do every couple of weeks for some time. Especially when you are still learning how to use Protégé and for some reason it decides to mangle all your work.

Mastie is doing alright. She seems to have odd cycles: some days she is very active and hungry. Some other days she is very scared and doesn't eat very much. But, well, I can't say I see her many times anyway. But now it's the weekend, so I have a little bit greater chance.

I guess that's all for now. I'm finally getting a little tired after my dinner that ended with a nice shot of espresso (with some star anise ground with the coffee beans).

Wednesday, August 01, 2007

A new female in my life

(sorry, Amy asked for this title)

Anyway, it is true. Last weekend, after a very long deliberation period, I decided to get a pet. My restrictions on which pet to get were:

- Doesn't need attention all the time
- Doesn't need to be fed every day
- Doesn't stink
- Should provide some interaction
- It's not too hard to take care of

So, after a lot of walking around and looking at pet stores, the final choice was a Uromastyx maliensis, Mastie:



She has been great so far (or so I've been told - since the weekend I haven't actually seen her, as I leave for work before she wakes up and get back home after she has gone back to sleep). We'll see how it goes.

Saturday, July 21, 2007

On dining experience

I had to post this:

Dining in the Dark

It's an interesting article about a restaurant in L.A. that serves food in complete darkness. Apparently it's the latest thing in Europe, but just now arriving in the US. It must be really odd!