When dynamically generating SPARQL for Sesame, I seem to have gained about 0.5 seconds by placing the filters at the end of the statement. Is SPARQL executed in Sesame in a way that would explain this?
See the discussion at: http://www.openrdf.org/forum/mvnforum/viewthread?thread=1430
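For illustration, here is a minimal Python sketch of what "filters at the end" could look like when the query text is assembled dynamically. The `build_select` helper, the example pattern, and the regex filter are all hypothetical and not taken from the original scripts. It only shows the clause ordering and makes no claim about how Sesame evaluates the query.

```python
def build_select(variables, patterns, filters=()):
    """Assemble a SPARQL SELECT whose FILTER clauses all come after the triple patterns."""
    lines = ["SELECT DISTINCT " + " ".join("?" + v for v in variables), "WHERE {"]
    # Triple patterns first...
    lines += ["  " + p + " ." for p in patterns]
    # ...then every filter, at the end of the group.
    lines += ["  FILTER (" + f + ")" for f in filters]
    lines.append("}")
    return "\n".join(lines)


# Hypothetical example: match resources whose label contains "sesame".
query = build_select(
    ["s", "label"],
    ["?s <http://www.w3.org/2000/01/rdf-schema#label> ?label"],
    ['regex(str(?label), "sesame", "i")'],
)
print(query)
```

Keeping patterns and filters in separate lists makes it easy to time both orderings against the same store.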
SPARQL views are a caching mechanism for prepared queries: a simple way of storing subgraphs to make queries faster. They can use more resources (disk space and connection resources) while improving speed.
Here are the results of running the scripts against a database of 90,000 statements: 34 queries in about 0.3 seconds.
The script retrieves all distinct properties, then loops through them and retrieves all distinct values for each property.
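The loop described above can be sketched roughly as follows, in Python. This is not the original script: `run_query` is a hypothetical callable that sends SPARQL to the store and returns a list of bindings, and `fake_run_query` is an in-memory stand-in over made-up triples so the example runs without a server.

```python
PROPERTIES_QUERY = "SELECT DISTINCT ?p WHERE { ?s ?p ?o }"


def values_query(prop_uri):
    """Build the per-property query for its distinct values."""
    return "SELECT DISTINCT ?o WHERE { ?s <%s> ?o }" % prop_uri


def distinct_values_by_property(run_query):
    """One query for the properties, then one query per property."""
    values = {}
    for row in run_query(PROPERTIES_QUERY):
        prop = row["p"]
        values[prop] = [r["o"] for r in run_query(values_query(prop))]
    return values


# Made-up data and a stand-in for the real HTTP client (assumptions).
EX = "http://example.org/"
TRIPLES = [
    (EX + "a", EX + "color", "red"),
    (EX + "b", EX + "color", "blue"),
    (EX + "c", EX + "color", "red"),
    (EX + "a", EX + "size", "L"),
]
issued = []


def fake_run_query(sparql):
    """Answer only the two query shapes above, straight from TRIPLES."""
    issued.append(sparql)
    if sparql == PROPERTIES_QUERY:
        key, found = "p", [p for _, p, _ in TRIPLES]
    else:
        prop = sparql.split("<", 1)[1].split(">", 1)[0]
        key, found = "o", [o for _, p, o in TRIPLES if p == prop]
    # dict.fromkeys drops duplicates while keeping first-seen order.
    return [{key: v} for v in dict.fromkeys(found)]


values = distinct_values_by_property(fake_run_query)
print(len(issued), "queries issued")
```

With this shape the number of queries is one plus the number of distinct properties, which would be consistent with 34 queries for 33 properties.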
The punchline: after a number of tests, prepared queries are at this point only slightly faster. I may cache the tuple query in Quercus APC, but for now the regular queries are sufficiently fast.
COUNT of 33 DISTINCT TOTAL PROPERTIES
Looks like I'll be using a hybrid approach: the HTTP client, plus a custom class running on Java to do the aggregation emulation.
Here's a recap of the consequences.
The first points of failure are (since the HTTP client for Sesame works great):
* SPARQL/Sesame not having Aggregate functions
* Sesame not having ORDER BY
This produces large numbers of results and/or queries, which then need to be parsed from JSON, leading to the second point of failure:
* Zend JSON and native PHP 5.2 JSON are not fast enough (though perhaps they should not be expected to be, for 6000 results)
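To make the idea of emulating aggregates and ordering on the client concrete, here is a small Python sketch (not the Java class mentioned earlier). It assumes the store returns the standard SPARQL JSON results layout (`results.bindings`, with each variable carrying a `value`). The `count_by` helper and the sample data are hypothetical.

```python
import json
from collections import Counter


def count_by(results_json, var):
    """Emulate 'COUNT(*) ... GROUP BY ?var ORDER BY DESC(count)' on the client."""
    doc = json.loads(results_json)
    counts = Counter(
        b[var]["value"] for b in doc["results"]["bindings"] if var in b
    )
    # Sort by descending count, then by value so ties come out in a stable order.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


# Made-up result set in SPARQL JSON results form.
SAMPLE = json.dumps({
    "head": {"vars": ["p"]},
    "results": {"bindings": [
        {"p": {"type": "uri", "value": "http://example.org/color"}},
        {"p": {"type": "uri", "value": "http://example.org/size"}},
        {"p": {"type": "uri", "value": "http://example.org/color"}},
    ]},
})

ranked = count_by(SAMPLE, "p")
for value, n in ranked:
    print(n, value)
```

The cost noted above still applies: every row has to be transferred and parsed before it can be counted, which is why doing this step close to the store is attractive.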
Sesame 2.0 is 2-10 times faster than ARC for the queries I perform. However, Sesame 2.0 SPARQL lacks some features I need, like ORDER BY and full support for OPTIONAL. Even if these come through soon, I'm still going to need to build a bridge between Drupal/PHP and Sesame. I'll either need: