Archive for the 'song-list' Category

Hacked

Wednesday, June 25th, 2008

This blog has been hacked in a very peculiar way. See the summaries for the various blog entries on Google:

Google’s cache and the pages themselves seem “clean” - no mention of various drugs and such, and no “bad” links. So I am unsure what the benefit to the hackers is. In any case, I used this as an opportunity to upgrade to WP 2.5.1.

Hopefully the problem will go away when Google & Yahoo reindex.

New Release

Thursday, May 31st, 2007

Finally I’ve upgraded to Rails 1.2.2 with a new database (now around 3100 artists) and added DVD and Book boxes to the song list page.

Crossed the 2000 Mark

Sunday, March 4th, 2007

I just released a new version of the database. There are now over 2,000 artists listed. This is of course just a drop in the sea of available music.

I’ve started a new job, so I have limited time to enhance the site and collect more data. At the moment I am trying to get the current content indexed a bit better by Google and Yahoo. This has been challenging - more in a future post.

UTF-8 and Normalizing Latin Names from Amazon

Monday, January 8th, 2007

I managed to avoid understanding anything related to UTF-8 until yesterday. I had a few hacks that allowed me to normalize song titles coming from Amazon so that I could match them to the titles in other online libraries. For example:
“Danza de los Ñáñigos” would be normalized to “Danza de los Nanigos”.

My old hacks did not cover many of the Latin American characters with diacritics, which limited my ability to collect Spanish-speaking artists (Ferret crashes when faced with UTF-8 multi-byte characters). Unfortunately I couldn’t find a library that allowed me to do what I wanted (though this Perl reference was a good explanation).

I ended up with an improved hack: I manually built a translation table for the extended Latin characters and used String#unpack('U*') to translate U+00C0 through U+017E to the corresponding ASCII characters. Amazon Webservices do insert multi-byte control characters in some track titles, which I now just discard.
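Here is a minimal sketch of that approach, not the actual code from the site. The table name LATIN_TO_ASCII and the method name normalize_title are made up for illustration, and the table is deliberately partial; a real one would cover every letter from U+00C0 through U+017E. It also assumes that anything outside printable ASCII and the table can simply be dropped, which is how the stray control characters get discarded.

```ruby
# Hypothetical, partial translation table: Unicode code point => ASCII letter.
# A complete table would cover every letter from U+00C0 through U+017E.
LATIN_TO_ASCII = {
  0x00C1 => 'A', # Á
  0x00C9 => 'E', # É
  0x00D1 => 'N', # Ñ
  0x00E1 => 'a', # á
  0x00E9 => 'e', # é
  0x00ED => 'i', # í
  0x00F1 => 'n', # ñ
  0x00F3 => 'o', # ó
  0x00FA => 'u', # ú
  0x00FC => 'u'  # ü
}.freeze

# Returns an ASCII-only version of a UTF-8 title.
def normalize_title(title)
  # unpack('U*') decodes the UTF-8 bytes into an array of code points.
  chars = title.unpack('U*').map do |cp|
    if cp >= 0x20 && cp < 0x7F
      cp.chr                # printable ASCII passes through unchanged
    else
      LATIN_TO_ASCII[cp]    # nil for anything not in the table
    end
  end
  # compact drops the nils, i.e. control and unmapped characters (an assumption).
  chars.compact.join
end

normalize_title("Danza de los Ñáñigos") # => "Danza de los Nanigos"
```

Note that unpack('U*') raises an ArgumentError on malformed UTF-8, so input from an external service may need a rescue around the call.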

It seems that Rails 1.2 may have some UTF-8 libraries included - looking forward to that. Here are some other links that I found useful if you are trying to get a handle on UTF-8.

  • Wikipedia Article that demystified the representation and what each bit is. The variety of representations in different articles (Octal, Hex or decimal) is very confusing.
  • Great dynamic Unicode/UTF Table
  • This article, which has code and a discussion, helped me understand some of Ruby’s problems with Unicode/multi-byte representations.
  • Unicode Hacks - Rails support for Unicode - seems this is what will be included in 1.2. Read the documentation.

Observations: a few days after launch

Friday, December 8th, 2006

I spent the weekend updating the database and cleaning up some visual issues. Some lessons learned:

  • When redeploying, I must kill the old FastCGI processes.
  • I use caching for the top two levels of the website - the results are the same for every visitor and don’t change over time. The third-level data is small and fast enough to leave uncached. However, for every cosmetic change I make, the cached pages need to be regenerated. I have placed the cached pages in a separate directory for easier management, but I don’t want the first user on each cached page to wait. One option is to generate them all on my staging site and then symlink. That doesn’t work because the internal links that Rails generates are absolute, not relative. I ended up writing a script that makes requests for all the pages.
  • Not many have visited so far - the search engines have not indexed the site yet. I am getting some traffic from The Real Book Listening Guide, and got some in the last couple of days from Happycodr, a website for Rails applications.
  • I’ve added some more data to the site, the tables are getting quite big - I need to do some optimizing of database access.
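The cache-warming script mentioned above could look roughly like the sketch below. This is an illustration rather than the script I actually run: the names warm_cache and DEFAULT_FETCHER, the host, and the paths are all hypothetical. It requests each page once so that Rails writes the cached copy before a real visitor arrives.

```ruby
require 'net/http'
require 'uri'

# Default fetcher: performs a GET and returns the HTTP status code as a string.
DEFAULT_FETCHER = ->(uri) { Net::HTTP.get_response(uri).code }

# Requests every path under base_url and returns [path, status] pairs.
# The fetcher is injectable so the method can be exercised without a network.
def warm_cache(base_url, paths, fetcher = DEFAULT_FETCHER)
  paths.map do |path|
    uri = URI.join(base_url, path)
    [path, fetcher.call(uri)]
  end
end

# Example usage (hypothetical host and paths):
#   warm_cache('http://www.example.com/', ['/artists', '/artists/a']).each do |path, status|
#     puts "#{status} #{path}"
#   end
```

The list of paths has to come from somewhere, for example the same database query that builds the top-level pages; that part is left out here.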

Napster Music Webservices

Saturday, December 2nd, 2006

Napster has a webservice that seems to be unpublished, but it is used on their website to get information on artists, albums and tracks. This is what I have discovered about the API, which I use for song-list.net.

I’ll detail the query language and not the results, as they are in XML and self-documenting. The data provided is quite rich.

Keyword Search Examples

Querying by ID

Artists, albums and tracks all have 8-digit IDs that can be used to query a particular item.

Playing Music

You can use the information gathered to play the music through a browser:

Let me know if you discover more features.

First Post

Monday, November 27th, 2006

This blog is for writing about song-list.net, and the technology used to build it. I’ve enjoyed lots of information people have posted online. Hopefully I can contribute a bit myself.

song-list.net is a side project, a distraction from my main project, which is expanding the capabilities of The Real Book Listening Guide (but that is a topic for another post).

The website was written using Ruby on Rails and MySQL, and uses the Amazon, Rhapsody (published), and Napster (unpublished) Webservices. Data from iTunes and eMusic was screen-scraped. In the upcoming weeks I hope to document some experiences using these tools.

The data collected so far is minimal (about 100 artists) since I wanted to go live with the application and see if it sparks some interest. Collection of data is automatic, but does require a human eye because all data is dirty. Over the next few months I should be able to improve the tools, the quality and the quantity of the data.