The Stack Overflow Tag Engine – Part 2

I've added a Resources and Speaking page to my site; check them out if you want to learn more. There's also a video available of my NDC London 2014 talk "Performance is a Feature!".

This is the long-delayed part 2 of a mini-series looking at what it might take to build the Stack Overflow Tag Engine. If you haven't read part 1, I recommend reading it first.

Since the first part was published, Stack Overflow published a nice performance report, giving some more stats on the Tag Engine servers. As the report shows, they run the Tag Engine on some pretty powerful servers, but only have a peak CPU usage of 10%, which means there's plenty of headroom available. It's a nice way of being able to cope with surges in demand or busy times of the day.

In part 1, I only really covered the simple things, i.e. a basic search for all the questions that contain a given tag, along with multiple sort orders (by score, view count, etc). But the real Tag Engine does much more than that, for instance:

What is he talking about here? Well, any time you do a tag search, per-user exclusions can be applied after the actual search has been done. These exclusions are configurable and allow you to set "Ignored Tags", i.e. tags that you don't want to see questions for. Then when you do a search, it will exclude these questions from the results.

Note: it will let you know if there were questions excluded due to your preferences, which is a pretty nice user experience. If that happens, you get a message telling you so (it can also be configured so that matching questions are greyed out instead).

Now most people probably have just a few exclusions, maybe tens at most, but fortunately a Stack Overflow power user got in touch with me and shared his list of preferences. You'd need to scroll across to appreciate the full extent of this list, but here are some statistics to help you:

- It contains 3,753 items, of which 210 are wildcards.
- The tags and wildcards expand to 7,677 tags in total (out of a possible 30,529 tags).
- There are 6,428,251 questions (out of 7,990,787) that have at least one of the 7,677 tags in them!

If you want to see the wildcard expansion in action you can visit the URLs below:

Now a simple way of doing these matches is the following, i.e. loop through the wildcards and compare each one with every single tag to see if it could be expanded to match that tag. In code, this starts with `var expandedTags = new HashSet<string>();` followed by a `foreach (var wildcard in wildcardsToExpand)` loop, where `IsActualMatch(..)` is a simple method that does a basic string `StartsWith`, `EndsWith` or `Contains` as appropriate.

This works fine with a few wildcards, but it's not very efficient.
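To make the naive wildcard expansion described above concrete, here is a minimal sketch of it. It is not the real Tag Engine code: the class name `WildcardExpansion`, the method `ExpandWildcards`, the `allTags` parameter and the body of `IsActualMatch` are assumptions made for illustration, and only wildcards with a leading and/or trailing `*` are handled.

```csharp
using System;
using System.Collections.Generic;

// Illustrative sketch only; names other than expandedTags, wildcardsToExpand
// and IsActualMatch are hypothetical and not taken from the real Tag Engine.
public static class WildcardExpansion
{
    // Assumed behaviour: "abc*" -> StartsWith, "*abc" -> EndsWith,
    // "*abc*" -> Contains, no '*' at either end -> exact match.
    public static bool IsActualMatch(string tag, string wildcard)
    {
        bool leadingStar = wildcard.StartsWith("*", StringComparison.Ordinal);
        bool trailingStar = wildcard.EndsWith("*", StringComparison.Ordinal);
        string text = wildcard.Trim('*');

        if (leadingStar && trailingStar)
            return tag.IndexOf(text, StringComparison.Ordinal) >= 0;
        if (leadingStar)
            return tag.EndsWith(text, StringComparison.Ordinal);
        if (trailingStar)
            return tag.StartsWith(text, StringComparison.Ordinal);
        return string.Equals(tag, text, StringComparison.Ordinal);
    }

    // Naive expansion: compare every wildcard with every single tag.
    public static HashSet<string> ExpandWildcards(
        IEnumerable<string> wildcardsToExpand,
        IReadOnlyCollection<string> allTags)
    {
        var expandedTags = new HashSet<string>();
        foreach (var wildcard in wildcardsToExpand)
        {
            foreach (var tag in allTags)
            {
                if (IsActualMatch(tag, wildcard))
                    expandedTags.Add(tag);
            }
        }
        return expandedTags;
    }
}
```

The cost of this approach is the number of wildcards multiplied by the number of tags. With the power user's 210 wildcards and 30,529 possible tags, that is roughly 6.4 million string comparisons for a single expansion, which is why it only works well when there are just a few wildcards.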