The last month went rather badly. First I got distracted by another project for two weeks … and then when I finally got back to working on the recommender, I found a bunch of problems with the way my database code was working (or rather, not working).
Essentially, I made a mistake when I wrote the tests: I mostly tested my DataSet logic, and not the round-trip to the database, or even the individual queries against the database. Even though my unit tests were passing, the data wasn’t being stored in the database correctly. I’ve added a few tests which actually run SQL queries against the back-end database after exercising the interface, and I’ll add a few more in the next week or so as I actually try running the scraper, but the couple that are there now will keep the same bug from coming back.
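To illustrate the kind of test I mean, here is a minimal sketch in Python with an in-memory SQLite database. It is not the project's actual code: the `BookmarkStore` class, the `bookmarks` table, and the example URL are all hypothetical stand-ins. The point is only that the test goes around the interface and checks the stored rows with a raw SQL query.

```python
import sqlite3


class BookmarkStore:
    """Hypothetical data-access class standing in for the real interface."""

    def __init__(self, conn):
        self.conn = conn
        # Assumed schema: one row per (url, tag) pair, no duplicates allowed.
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS bookmarks ("
            "url TEXT NOT NULL, tag TEXT NOT NULL, UNIQUE(url, tag))"
        )

    def add(self, url, tag):
        # INSERT OR IGNORE keeps a repeated save from creating a second row.
        self.conn.execute(
            "INSERT OR IGNORE INTO bookmarks (url, tag) VALUES (?, ?)",
            (url, tag),
        )
        self.conn.commit()


def test_round_trip():
    conn = sqlite3.connect(":memory:")
    store = BookmarkStore(conn)

    # Exercise the interface, including a repeated save of the same pair.
    store.add("http://example.com/", "som")
    store.add("http://example.com/", "som")

    # Bypass the interface and ask the database directly what it holds.
    rows = conn.execute("SELECT url, tag FROM bookmarks").fetchall()
    assert rows == [("http://example.com/", "som")]
    return rows
```

A test that only inspected the in-memory objects would pass even if `add` never reached the database, which is roughly the trap I fell into.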
At any rate, I’m clearly at least three weeks behind schedule. I expect I’ll be able to get back on track, but it will be partly at the cost of leaving the SOM code in C++ for now, or of using an automatic converter to do most of the conversion from C++ to C#. Overall, that’s not really a huge problem, as long as I can hook it in without resorting to a command-line interface.
The inaptly named Windows RSS Platform is actually part of IE 7, not part of Windows, and is therefore available on Windows XP if IE 7 has been installed, as well as on Windows Vista (where IE 7 is included out of the box). Having said that, it isn’t just for IE: it includes a complete COM API which is usable from script or the .NET Framework, and the header files are part of the Windows Platform SDK and usable from C/C++.
The RSS Platform is intended to introduce a unified approach to RSS for Windows applications, where all applications use the same RSS feed store, and a service handles downloading the RSS feeds — including enclosures if requested — and normalizes them so applications need not handle parsing all the different feed formats (that is, you only need to parse the Microsoft-normalized RSS 2.0 with extensions).
As a platform for building RSS-based applications, it’s very well done, and well thought out. It’s now ridiculously easy to create an RSS reader, since the platform removes all need to parse XML except in the weirdest situations, and allows all applications to be instantly integrated on the same list of RSS feeds … let me show you ….
6 Jan
Well, I’ve decided to refer to my SOM Recommender by the initialism SOMR (which, for the sake of argument, I pronounce “sommer”, like “summer” but with an o), and I’ve been working on it for about a month. It’s been a busy month outside of working on this project, with end-of-year stuff at work, and of course, the Christmas holiday with family, but I’m basically tracking correctly on my schedule despite that.
After some extensive research, I decided to go the project route instead of the thesis route at RIT, primarily because I’m not immediately going to be working toward a doctorate. It’s also because I’m still a bit more interested in the code part of the research, and the project defense is rumored to be easier than the thesis defense, without the requirement to prove that my idea is original. At the end of the day, I’m still working full time, and have a family, so my primary concern right now is to get my degree completed as soon as I can.
My project proposal was accepted, and I’ve started working on code and databases. I’ll be posting regular updates here, along with a link to my subversion server, but I wanted to start by posting a short summary of my project.
The project comprises designing and implementing a hybrid recommender system for web pages which uses data from a social tagging system to recommend interesting items to users. For the initial implementation, the tagging data will come from del.icio.us, one of the oldest and largest public social bookmarking systems. The system will cluster items using a self-organizing map (SOM) network and will include a new SOM visualizer that allows users to see and modify the system’s evaluation of their regions of interest.
The focus of the programming project will be a scraper for gathering tagged URLs from del.icio.us, a visualizer, and the recommender. The SOM network code will be based on existing implementations, and two recommenders will be built to compare the relative quality of the recommendations: one using a single map for both URLs and users, and the other using separate maps.
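For readers who haven't met a SOM before, here is a minimal sketch of the basic idea in plain Python. It is only an illustration under my own assumptions, not the implementation the project will use: the function names, the Gaussian neighborhood, the linear decay of learning rate and radius, and the tiny tag vectors at the bottom are all made up for the example. Each item is a vector, each node on a small grid holds a weight vector, and training pulls the best matching node and its grid neighbors toward each item.

```python
import math
import random


def best_matching_unit(weights, vector):
    """Return the grid position whose weight vector is closest to vector."""
    best, best_dist = None, float("inf")
    for pos, w in weights.items():
        dist = sum((a - b) ** 2 for a, b in zip(w, vector))
        if dist < best_dist:
            best, best_dist = pos, dist
    return best


def train_som(data, rows, cols, epochs=50, lr0=0.5, seed=0):
    """Train a rows x cols map on data and return {(row, col): weights}."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    dim = len(data[0])
    weights = {
        (r, c): [rng.random() for _ in range(dim)]
        for r in range(rows)
        for c in range(cols)
    }
    radius0 = max(rows, cols) / 2
    for epoch in range(epochs):
        # Assumed schedule: learning rate and radius shrink linearly.
        frac = epoch / epochs
        lr = lr0 * (1 - frac)
        radius = max(radius0 * (1 - frac), 0.5)
        for vector in data:
            br, bc = best_matching_unit(weights, vector)
            for (r, c), w in weights.items():
                # Nodes near the winner on the grid move more than distant ones.
                grid_dist2 = (r - br) ** 2 + (c - bc) ** 2
                influence = math.exp(-grid_dist2 / (2 * radius ** 2))
                for i in range(dim):
                    w[i] += lr * influence * (vector[i] - w[i])
    return weights


# Hypothetical items described by three tag weights each.
items = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.0, 0.0, 1.0], [0.0, 0.1, 0.9]]
som = train_som(items, rows=3, cols=3)
positions = [best_matching_unit(som, item) for item in items]
```

After training, `positions` holds the grid cell each item lands on; items with similar tags should tend to land on the same or neighboring cells, which is what the visualizer would draw.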
For those who are interested, the full project proposal is here with a short specification and design document, as well as my proposed schedule, which I’m not too far off of so far.
14 Jun
Well, I’ve started taking the “MS Thesis and Project Seminar” course at RIT, which is the first step towards registering my Thesis (or project) with the school and finishing off my degree!
I’m currently nursing two main ideas for projects related to Artificial Intelligence, and this past week I met with Jessica Bayliss to discuss the second idea and started work on an independent study focusing on refining this second idea and determining whether it’s Thesis material or not. The majority of what I did this week, and plan to do over the next week, is reviewing the existing research in the area of Self-Organizing Maps and classifiers, including going back over my own previous research to refresh my memory.
I’ve also created draft documentation of both ideas as posts on my blog. I’ve linked to them below, but these posts will change a lot over the next couple of months, and are marked “private” and thus not accessible to the general public … sign up on the front page and drop me a line if you’re really interested.