I've been testing the new search server being developed by Musicbrainz. Currently Musicbrainz uses Lucene, but the new server will use Xapian. Whilst I'm not convinced that Xapian is intrinsically better than Lucene, it has helped the Musicbrainz developers to solve some problems, and I've been happy to help.
Monday, 30 June 2008
Friday, 20 June 2008
Jaikoz has embedded database
The new version is going to use the Apache Derby database. This is a very small relational database written in 100% Java, so there should be no OS-specific problems, and because it can be embedded in your program there is no database or service to install.
I am currently using it to store release and artist lookups from Musicbrainz so that a release/artist only has to be checked once. This already occurs in the current version of Jaikoz, but everything is held in memory, so memory becomes a problem over time and the results are not kept between restarts of Jaikoz.
I'm also taking advantage of this locally stored release data so that I don't actually have to do a lookup for every track if I can find a good match by looking at already downloaded and used releases. Consider the usual case of all tracks in a release being looked up: if the metadata is good enough, the new system will only require one track, artist and release lookup for the first track, and all the other tracks can be matched using the downloaded release info, whereas with the current system a track query is done for every track.
So Jaikoz should be faster and use less memory.
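To make the saving concrete, here is a minimal sketch of the idea in Python. It is not Jaikoz's code: the `FakeServer`, `ReleaseCache` and `match_track` names are made up for illustration, the cache is a plain in-memory dictionary rather than Derby, and the separate track, artist and release lookups are collapsed into a single counted query.

```python
from dataclasses import dataclass


@dataclass
class Release:
    artist: str
    title: str
    track_titles: list


class FakeServer:
    """Hypothetical stand-in for the remote service; counts queries made."""

    def __init__(self, releases):
        self.releases = releases
        self.queries = 0

    def lookup_release(self, artist, album):
        self.queries += 1
        for release in self.releases:
            if (release.artist, release.title) == (artist, album):
                return release
        return None


class ReleaseCache:
    """Remembers downloaded releases so later tracks can be matched locally."""

    def __init__(self, server):
        self.server = server
        self.cache = {}  # (artist, album) -> Release

    def match_track(self, artist, album, title):
        key = (artist.lower(), album.lower())
        release = self.cache.get(key)
        if release is None:
            # Only the first track of a release reaches the server.
            release = self.server.lookup_release(artist, album)
            if release is None:
                return None
            self.cache[key] = release
        # Later tracks are matched against the stored track list.
        return release if title in release.track_titles else None


# "Another Track" is a placeholder title, not real release data.
server = FakeServer(
    [Release("The Cribs", "The New Fellas", ["Hey Scenesters!", "Another Track"])]
)
cache = ReleaseCache(server)
for track_title in ["Hey Scenesters!", "Another Track"]:
    cache.match_track("The Cribs", "The New Fellas", track_title)
print(server.queries)
```

Both tracks are matched, but only the first one causes a query. A real implementation would also need a fallback query for tracks that do not match any stored release.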
In subsequent versions I want to use the database to hold track metadata instead of it being held in memory. This will then allow Jaikoz to be used on very large collections without running out of memory. But I'll never make the database the sole repository of any info, so it will never be necessary to preserve the database in order to access your metadata - it is only a tool to improve memory usage and performance.
Tuesday, 10 June 2008
PlugIns
I'm thinking about adding support for writing plugins. Would this be of interest to anyone, and if so would you be happy to write them in Java, or would you prefer a scripting language like Python?
Scripting
I have been improving the Rename Filename/SubFolder Masks to allow a bit more control over how your files are named from your tags,
e.g. $if(%bestartist%,%bestartist%-)$if(%album%,%album%-)$if(%trackno%,%trackno%-)%title%
could give you
The Cribs-The New Fellas-01-Hey Scenesters!.mp3
or
The New Fellas-01-Hey Scenesters!.mp3
depending on your metadata.
This syntax comes from http://wiki.hydrogenaudio.org/index.php?title=Foobar2000:Titleformat_Reference#.24if.28cond.2Cthen.2Celse.29 but for this release Jaikoz will not support much more than the simple if statement. However, in later releases I plan to support more functions using a suitable templating/scripting language.
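For anyone wondering how such a mask is evaluated, here is a rough sketch in Python. It is only an illustration and not how Jaikoz implements it: `expand_mask` is a made-up name, and the sketch assumes a two-argument `$if(%field%,then)` with no else branch, no nesting and no parentheses inside the then part.

```python
import re

# Matches either a simple $if(%field%,then) or a plain %field% reference.
TOKEN = re.compile(r"\$if\(%(\w+)%,([^()]*)\)|%(\w+)%")
FIELD = re.compile(r"%(\w+)%")


def expand_mask(mask, tags):
    """Expand a rename mask using a dict of tag values (hypothetical helper)."""

    def field_value(match):
        # Missing fields expand to an empty string.
        return tags.get(match.group(1), "")

    def replace(match):
        cond_field, then_part, plain_field = match.groups()
        if plain_field is not None:
            return tags.get(plain_field, "")
        if tags.get(cond_field):
            # The condition field has a value, so emit the then part.
            return FIELD.sub(field_value, then_part)
        return ""

    return TOKEN.sub(replace, mask)


mask = "$if(%bestartist%,%bestartist%-)$if(%album%,%album%-)$if(%trackno%,%trackno%-)%title%"
tags = {
    "bestartist": "The Cribs",
    "album": "The New Fellas",
    "trackno": "01",
    "title": "Hey Scenesters!",
}
print(expand_mask(mask, tags) + ".mp3")
```

With all four fields present this prints the first filename shown above, and with `bestartist` missing or empty the artist prefix and its trailing hyphen are dropped. The sketch does not strip characters that are illegal in filenames.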
Wednesday, 4 June 2008
Jaikoz roundtrip processing of genres
I now have Jaikoz retrieving tags as genres from Musicbrainz, and it can also submit genres back to Musicbrainz. This means you can easily help build the Musicbrainz folksonomy, and you can use Musicbrainz to back up genres for your tracks. With a little bit of fiddling around it can also be used to easily tag your tracks with non-genre information, such as the 'owned' tag.
The retrieve genres algorithm is as follows:
Get all tags for track
If any exist, strip out any we don't want using a configurable blacklist
Favour tags that match Winamp's genre list
If there are multiple matches, pick the most popular one
If there are no tags that match Winamp's genre list, pick the most popular tag
If there are no matching tags repeat at release level.
If there are no matching tags repeat at artist level.
Only one genre is picked at the moment because most media players and audio formats only support a single genre, so adding multiple genres doesn't seem to help much, but if you disagree let me know.
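The steps above can be sketched roughly as follows. This is an illustration under a few assumptions rather than the Jaikoz code: the function names are made up, each level's tags are assumed to arrive as a dictionary of tag name to popularity count, `WINAMP_GENRES` is only a tiny subset of the real list, and "no matching tags" is read as "nothing left after applying the blacklist".

```python
# Small illustrative subset of Winamp's genre list, lower-cased for comparison.
WINAMP_GENRES = {"blues", "jazz", "metal", "pop", "rock"}


def pick_from_level(tag_counts, blacklist):
    """Pick one genre from a single level (track, release or artist)."""
    candidates = {
        tag: count
        for tag, count in tag_counts.items()
        if tag.lower() not in blacklist
    }
    if not candidates:
        return None
    # Prefer tags that are recognised genres; otherwise use whatever remains.
    known = {
        tag: count
        for tag, count in candidates.items()
        if tag.lower() in WINAMP_GENRES
    }
    pool = known or candidates
    # The most popular tag in the chosen pool wins.
    return max(pool, key=pool.get)


def pick_genre(track_tags, release_tags, artist_tags, blacklist=frozenset({"owned"})):
    """Try the track first, then fall back to the release, then the artist."""
    for tag_counts in (track_tags, release_tags, artist_tags):
        genre = pick_from_level(tag_counts, blacklist)
        if genre is not None:
            return genre
    return None


# Hypothetical tag counts for one track, its release and its artist.
print(pick_genre({"owned": 5, "britpop": 3, "rock": 2}, {}, {}))
```

Here 'owned' is removed by the blacklist and 'rock' is chosen over the more popular 'britpop' because it is in the genre list.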