EntryScape Pipelines
See the README in the EntryScape Pipelines Git repository for information on how to check out and build the application. It also provides instructions on how to interact with the application through the CLI and how to use the harvester daemon.
EntryScape Pipelines serves as an extract, transform, load (ETL) backend for harvesting metadata. A pipeline consists of transforms that perform actions such as checking, fetching, validating, and merging data. Depending on the data source to be harvested, a pre-built pipeline may be chosen or a custom pipeline may be pieced together. Once created, these pipelines can be controlled using EntryScape Registry.
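As a purely conceptual illustration, the idea of a pipeline as an ordered chain of transforms can be sketched in Python. The function names (`fetch`, `validate`, `merge`, `run_pipeline`), the context dictionary, the example URL, and the sample records are all assumptions made for this sketch. They do not reflect the actual EntryScape Pipelines API or its configuration format, which is described in the README.

```python
from typing import Any, Callable, Dict, List

# Hypothetical types for illustration only; not the EntryScape Pipelines API.
Context = Dict[str, Any]
Transform = Callable[[Context], Context]


def fetch(ctx: Context) -> Context:
    # Stand-in for fetching from a remote source: returns inline sample data.
    records = [{"id": "a", "title": "Dataset A"}, {"id": "b"}]
    return {**ctx, "records": records}


def validate(ctx: Context) -> Context:
    # Keep records that carry a title and set aside the ones that do not.
    valid = [r for r in ctx["records"] if "title" in r]
    rejected = [r for r in ctx["records"] if "title" not in r]
    return {**ctx, "records": valid, "rejected": rejected}


def merge(ctx: Context) -> Context:
    # Merge the validated records into a store keyed by record id.
    store = dict(ctx.get("store", {}))
    for record in ctx["records"]:
        store[record["id"]] = record
    return {**ctx, "store": store}


def run_pipeline(transforms: List[Transform], ctx: Context) -> Context:
    # Apply each transform in order, passing the context from one to the next.
    for transform in transforms:
        ctx = transform(ctx)
    return ctx


result = run_pipeline(
    [fetch, validate, merge],
    {"source": "https://example.org/catalog"},  # hypothetical source URL
)
```

In this sketch a custom pipeline is just a different list of transforms passed to `run_pipeline`, which mirrors the idea of choosing a pre-built pipeline or piecing one together.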
The primary focus of the repository is harvesting, which includes updating and deleting entries in EntryStore, supporting various protocols and formats, and sending notifications in case of failure. Beyond this primary focus, a range of additional actions can be invoked: statistics can be generated, exports can be made, links and APIs can be checked, data can be cached, and sitemaps can be generated.