We often receive requests to install the Open PHACTS system on a local server. At present we do not have a "simple" mechanism for doing this, although we hope to provide one in the future. In the meantime, the options are outlined below.
Before we get started... A note on architecture
The Open PHACTS architecture is shown here. It is important to note that the system is not just a triple store. Of course, a significant amount of data (>2 billion triples and increasing) is loaded into the Virtuoso triple store. However, the mappings between different database identifiers are not stored there. Mappings are held in a separate system called the Identifier Mapping Service (IMS) and delivered through a web service. A software component known as the workflow engine (implemented in the Open PHACTS version of Puelia) queries the IMS at query time and injects the mappings into the SPARQL query, which is then executed. This means that no database cross-references are held in the triple store, so you also need the IMS web service component to join the data. Thus:
- If you wish to install and query the Open PHACTS data locally, you need, at a minimum, the cache (triple store), the IMS and the workflow engine.
- Or you need to implement your own methodology for performing the mapping (e.g. by downloading the mappings separately and inserting them in the triple store).
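To make the two approaches above concrete, here is a minimal Python sketch under stated assumptions. It is not the Open PHACTS or Puelia implementation: the example URIs, the in-memory `EXAMPLE_MAPPINGS` table standing in for an IMS response, the function names, the use of a SPARQL `VALUES` clause, and the choice of `skos:exactMatch` as the mapping predicate are all hypothetical and chosen only for illustration.

```python
from string import Template

# Hypothetical stand-in for an IMS response: one identifier URI mapped to
# equivalent URIs in other datasets. The real IMS is a web service.
EXAMPLE_MAPPINGS = {
    "http://example.org/compound/A1": [
        "http://example.org/chem-db/123",
        "http://example.org/drug-db/XYZ",
    ],
}

# Illustrative query template; $values is filled in at query time.
QUERY_TEMPLATE = Template("""SELECT ?s ?p ?o WHERE {
  VALUES ?s { $values }
  ?s ?p ?o .
}""")


def lookup_mappings(uri, mappings=EXAMPLE_MAPPINGS):
    """Return the input URI followed by every URI mapped to it, without duplicates."""
    equivalents = [uri]
    for other in mappings.get(uri, []):
        if other not in equivalents:
            equivalents.append(other)
    return equivalents


def inject_mappings(uri, template=QUERY_TEMPLATE):
    """Approach 1: expand the query with all equivalent URIs before executing it."""
    values = " ".join(f"<{u}>" for u in lookup_mappings(uri))
    return template.substitute(values=values)


def mappings_as_triples(mappings=EXAMPLE_MAPPINGS):
    """Approach 2: turn mappings into N-Triples lines for loading into the triple store."""
    predicate = "<http://www.w3.org/2004/02/skos/core#exactMatch>"
    return [
        f"<{source}> {predicate} <{target}> ."
        for source, targets in mappings.items()
        for target in targets
    ]


if __name__ == "__main__":
    print(inject_mappings("http://example.org/compound/A1"))
    print("\n".join(mappings_as_triples()))
```

With the first approach the triple store never needs to hold cross-references, but a mapping lookup is required for every query. With the second, the mappings are loaded once and the joins happen inside the store, at the cost of keeping the loaded mappings up to date yourself.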
If you are interested in benchmarking, it may be more helpful to see the paper "On the Formulation of Performant SPARQL Queries", which provides links to a dataset similar to those used in Open PHACTS, along with queries corresponding to those used for Open PHACTS.
The options for running Open PHACTS locally are:
- Obtain a "clone" of the system from our hosting provider. They will copy the entire instance as a VM ready for your use. This is the simplest method but comes with a significant cost.
- Download the software and data from our GitHub repository. Unfortunately, installing the system can be complex, and we are unable to offer support for local installation at present.
- Download the data via our API and load these into your own triple store. Much of our data has gone through extensive processing and represents a valuable resource for data mining.
- Let us know! The more people that want an installable package, the higher up the priority list we can push it.