No serious speed issues remain: the slowest query takes less than a second on my laptop, and the whole page returns in 1.5 to 2.5 seconds. That is still pretty poor performance compared to some other search implementations. I'm not using the SPARQL views or elaborate caching, and I haven't tried YARS yet, although Sesame in memory seems to be doing quite well.
When I have more time, I'll work on squeezing more speed out of this thing. But first:
There are still some issues with the correctness of subcategories, and a few other minor ones, but the hardest part is now complete.
Here is how counting the number of search results within a particular category works:
Since queries are done over RPC, object members can't hold data between calls, so I am using a small APC cache with a short lifespan.
Session 1
1. The main query is sent, returning a list of all object IDs matching the SPARQL query.
2. The list of object IDs is stored in the cache for that session.
3. Complete object data is fetched only for the LIMIT/OFFSET page of results, in a "subquery".
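The Session 1 steps could be sketched roughly as follows. This is an illustration in Python rather than the PHP/APC code the site actually runs: `ShortLivedCache` is a hypothetical in-process stand-in for APC, and `run_main_search`, the `search_ids:` key prefix, and the 30 second lifespan are all made-up names and values. The SPARQL query itself is left out; the sketch starts from the list of matching IDs.

```python
import time


class ShortLivedCache:
    """Hypothetical stand-in for APC: entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._items = {}

    def store(self, key, value):
        # Remember the value together with its expiry time.
        self._items[key] = (self._clock() + self._ttl, value)

    def fetch(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Expired: drop the entry and report a miss.
            del self._items[key]
            return None
        return value


def run_main_search(cache, session_id, matching_ids, limit, offset):
    """Cache the full ID list, then return only the IDs on the requested page.

    matching_ids is assumed to be the list of object IDs already returned
    by the main SPARQL query.
    """
    cache.store("search_ids:" + session_id, list(matching_ids))
    # Only this slice needs its complete object data loaded (the "subquery").
    return list(matching_ids[offset:offset + limit])


cache = ShortLivedCache(ttl_seconds=30)
page_ids = run_main_search(cache, "abc", [5, 8, 13, 21, 34], limit=2, offset=2)
```

The point of the split is that the cheap list of IDs is kept whole, while the expensive full object data is loaded only for the page being shown.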
Sessions 2-N
1. The category query for displaying facets gets a list of all the categories at the depth the user is browsing.
2. For each category, a list of object IDs is retrieved. (This is where I need to do inference to get objects belonging to subclasses. Drupal doesn't do this by default in taxonomy, since it's resource intensive.)
3. The array of object IDs from session 1 is retrieved from APC.
4. Then I do an array_intersect() on the array from session 1 and the category's object IDs, and do a count() on the result.
This lets me do an inexpensive array_intersect() on a cached search result for each category, instead of having to rerun the full search query in every one of sessions 2-N.
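Here is a rough sketch of the counting step, again in Python rather than the PHP I'm actually running. The names `descendants`, `facet_counts`, `children` and `objects_by_category` are hypothetical, the category data is invented for the example, and a set intersection stands in for array_intersect() followed by count() (which gives the same number as long as object IDs are unique). The subcategory walk is only one simple way to do the subclass inference, not necessarily how it ends up being done here.

```python
def descendants(category, children):
    """Return the category plus all of its subcategories, at any depth."""
    seen = set()
    stack = [category]
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # guards against cycles in the category graph
        seen.add(current)
        stack.extend(children.get(current, []))
    return seen


def facet_counts(cached_result_ids, visible_categories, children, objects_by_category):
    """Count how many cached search results fall under each visible category."""
    result_ids = set(cached_result_ids)
    counts = {}
    for category in visible_categories:
        # Collect IDs filed under the category itself or any subcategory.
        category_ids = set()
        for cat in descendants(category, children):
            category_ids.update(objects_by_category.get(cat, []))
        # Equivalent of count(array_intersect(...)) for unique IDs.
        counts[category] = len(result_ids & category_ids)
    return counts


# Invented example data: a tiny category tree and the objects filed under it.
children = {"animals": ["dogs", "cats"], "dogs": ["puppies"]}
objects_by_category = {
    "animals": [1],
    "dogs": [2, 3],
    "puppies": [4],
    "cats": [5],
    "plants": [6, 7],
}
cached_ids = [2, 4, 5, 6, 99]  # what session 1 would have left in the cache

counts = facet_counts(cached_ids, ["animals", "plants"], children, objects_by_category)
# counts == {"animals": 3, "plants": 1}
```

Each facet count then costs one category lookup plus an in-memory intersection, and the search query itself is never run again.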