Found a small error in checking subscriptions
Tuesday, March 4th, 2008
I found a small error in how subscriptions were checked for updates today. A list of subscriptions that required updating was selected from the database. Initially there was no sorting placed on this as it was assumed that all items would be checked. However, as the number of subscriptions has ...

Archive for March, 2008
Another update to duplicate detection
Tuesday, March 4th, 2008
So after letting the latest experiment run for a while, it's become apparent that while the new duplicate detection is better than the last setup, we're still not where we need to be. Originally ReadPath used a duplicate detection system based on shingle comparisons of the stories. This system was incredibly effective ...

Changes to Duplicate detection
Monday, March 3rd, 2008
This weekend I also loosened the requirements for duplicate detection. It seems that we were being a bit too aggressive. I'll let the new rules run for a while and see how they compare.

Impact of URL index
Monday, March 3rd, 2008
As part of the performance changes for the site, I changed the way that URLs are stored and looked up. This had been done in the database, but when ReadPath reached 12 million stored items, the memory required to maintain that index grew to over 3 GB. There are two other ...

Performance Update
Monday, March 3rd, 2008
I spent most of the weekend working on performance updates. Several new systems have been put into place to reduce overall load on the site. ReadPath is now archiving items older than two months to a separate system. For users, there shouldn't be any visible difference, but since the vast ...
