This is a wiki for the Amherst College Digital Collections repository. Its primary purpose is to describe some of the implementation details of the various components that make the site function. Any specific questions can be sent to Aaron Coburn.
Content Models - How our data is modeled
Personal Collections - Settings for how users' personal collections work
Access Controls - Users and groups, including which collections/datastreams each should be able to access.
External API - providing external access to Fedora's RESTful API.
Search Settings - Specific settings related to how searching and indexing work
External API - providing external access to Solr's API.
Document Cache - Riak is a distributed, high-performance key-value (NoSQL) store that supports parallel map-reduce queries. Most user queries retrieve data directly from Riak rather than accessing Fedora, making the entire system significantly faster and more fault-tolerant.
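As a rough illustration of the cache-first read path described above, here is a minimal sketch in Python. It is not our actual implementation: the `DocumentCache` class, the plain dict standing in for a Riak bucket, and the `fetch_from_repository` callable standing in for a Fedora request are all hypothetical, and the fall-back-on-miss behavior is an assumption made for the example.

```python
class DocumentCache:
    """Serve documents from a key-value store, falling back to the repository on a miss."""

    def __init__(self, store, fetch_from_repository):
        # `store` is a stand-in for a Riak bucket (any dict-like object works here).
        self.store = store
        # `fetch_from_repository` is a stand-in for a (slower) Fedora lookup.
        self.fetch_from_repository = fetch_from_repository

    def get(self, pid):
        doc = self.store.get(pid)
        if doc is None:
            # Cache miss: ask the repository once, then remember the answer.
            doc = self.fetch_from_repository(pid)
            self.store[pid] = doc
        return doc
```

With this shape, repeated requests for the same object touch the repository only once; every later read is answered from the key-value store.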
Fedora has an embedded, highly configurable ActiveMQ messaging system. When a Fedora object changes, any number of related systems can be notified of that change asynchronously, with no blocking, waiting or other synchronization that would slow the system down or make it unresponsive for users. As a result, the system is eventually consistent (a few seconds of lag are built in by design), and the different components can easily be distributed across multiple hosts.
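The asynchronous hand-off can be sketched with Python's standard library alone. This is only an in-process analogy, not ActiveMQ: the `start_dispatcher` helper, the event dictionary, and the two list-based listeners are hypothetical names chosen for the example.

```python
import queue
import threading


def start_dispatcher(events, listeners):
    """Deliver each queued event to every listener on a background thread."""

    def run():
        while True:
            event = events.get()
            if event is None:  # sentinel value: stop the worker
                events.task_done()
                break
            for listener in listeners:
                listener(event)
            events.task_done()

    worker = threading.Thread(target=run, daemon=True)
    worker.start()
    return worker


events = queue.Queue()
indexed, cached = [], []  # stand-ins for an indexer and a document cache
worker = start_dispatcher(events, [indexed.append, cached.append])

# The "repository" only enqueues the change and moves on; it never waits for listeners.
events.put({"pid": "demo:1", "action": "modify"})
events.put(None)
worker.join()
```

The producer returns as soon as the event is enqueued, and the listeners catch up shortly afterwards, which is the source of the brief, intentional lag mentioned above.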
Broker Configuration - Our Fedora messaging broker is linked to a remote broker cluster for higher availability. Furthermore, messages are pushed onto ActiveMQ queues rather than published on topics. Because a queue holds each message until it is consumed, the message routing application will never miss a message.
Message Routing - Message routing is handled by Apache Camel, which means that I no longer write any code when integrating different components over the ActiveMQ messaging system.
Routing Container - Camel can run in any Java container, but I chose Karaf because it makes deployment extremely easy.
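The reason for preferring queues over topics can be illustrated with a small in-memory model. This is a simplified sketch, not ActiveMQ itself: the `Topic` and `MessageQueue` classes are hypothetical, and the topic here models a plain (non-durable) subscription only.

```python
from collections import deque


class Topic:
    """Publish/subscribe: a message reaches only the subscribers connected right now."""

    def __init__(self):
        self.subscribers = []

    def publish(self, message):
        for deliver in self.subscribers:
            deliver(message)


class MessageQueue:
    """Point-to-point: messages wait in the queue until a consumer takes them."""

    def __init__(self):
        self.pending = deque()

    def publish(self, message):
        self.pending.append(message)

    def drain(self, deliver):
        while self.pending:
            deliver(self.pending.popleft())


topic, message_queue = Topic(), MessageQueue()

# The router is offline while this change is published.
topic.publish("demo:1 modified")
message_queue.publish("demo:1 modified")

# The router comes back online.
from_topic, from_queue = [], []
topic.subscribers.append(from_topic.append)
message_queue.drain(from_queue.append)
```

In this model the topic message is lost because nobody was subscribed when it was published, while the queued message is still waiting when the consumer returns.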
Cool URLs - making the URLs look friendly and platform-agnostic, all while remaining persistent and bookmarkable.
API JSON Structure - the expected JSON structure for search queries
Namespaces - namespaces in use throughout the repository
Displaying Metadata - metadata fields
The entire site uses RDFa to publish schema.org attributes. We have developed a mapping from standard MODS metadata to populate these attributes in the HTML.
We are also using the Open Graph protocol, primarily for better integration with social media sites such as Facebook, Google+ and Twitter.
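To give a feel for what this markup looks like, here is a small sketch that renders a metadata record as RDFa with schema.org properties and as Open Graph `meta` tags. The field names (`title`, `creator`, `date_created`, `abstract`) and both mapping tables are hypothetical simplifications for illustration; they are not our actual MODS mapping.

```python
from html import escape

# Hypothetical mapping from simplified MODS-like fields to schema.org properties.
SCHEMA_ORG = {"title": "name", "creator": "creator", "date_created": "dateCreated"}

# Hypothetical mapping from the same fields to Open Graph properties.
OPEN_GRAPH = {"title": "og:title", "abstract": "og:description"}


def rdfa_body(record):
    """Wrap the known fields of a record in RDFa spans using the schema.org vocabulary."""
    spans = [
        f'<span property="{prop}">{escape(record[field])}</span>'
        for field, prop in SCHEMA_ORG.items()
        if field in record
    ]
    return '<div vocab="http://schema.org/" typeof="CreativeWork">' + "".join(spans) + "</div>"


def open_graph_head(record):
    """Build the Open Graph meta tags that would go in the page head."""
    return [
        f'<meta property="{prop}" content="{escape(record[field])}" />'
        for field, prop in OPEN_GRAPH.items()
        if field in record
    ]
```

For example, `rdfa_body({"title": "Sample letter", "creator": "Example Author"})` produces a `div` typed as a schema.org `CreativeWork` containing one `span` per mapped field, with the values HTML-escaped.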
RDF - Details on the ontologies in use
Stanbol Entityhub - We are using a local Stanbol Entityhub with LC authority records, allowing us to correct and enhance the existing MODS metadata.
Fedora4 is the next version of the Fedora repository software. Our system currently uses the Fedora3 series, but we are actively experimenting with Fedora4.
Messaging - Details on messaging with Fedora4