IDLabResearch/eu-op-vocab-feed

EU Publications Office vocabulary feed

This repository contains the architectural configuration to produce and publish a Linked Data Event Stream (LDES) containing a feed of changes for a given, configurable controlled vocabulary, such as those managed by the EU Publications Office.

The vocabulary changes are modelled using the W3C Activity Streams 2 vocabulary.
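To make the modelling concrete, here is a minimal sketch of what a single change could look like when expressed as an Activity Streams 2 activity in JSON-LD. The helper name `as2_activity`, the choice of properties (`type`, `object`, `published`) and the example IRI are assumptions for illustration; the exact shape of the members produced by the pipeline may differ.

```shell
#!/bin/sh
# Hypothetical helper: print a minimal Activity Streams 2 activity as JSON-LD.
#   $1 = activity type (e.g. Create, Update, Delete)
#   $2 = IRI of the affected vocabulary term
#   $3 = ISO 8601 timestamp of the change
# Assumption: the arguments contain no characters that need JSON escaping.
as2_activity() {
  printf '{\n'
  printf '  "@context": "https://www.w3.org/ns/activitystreams",\n'
  printf '  "type": "%s",\n' "$1"
  printf '  "object": "%s",\n' "$2"
  printf '  "published": "%s"\n' "$3"
  printf '}\n'
}
```

For example, `as2_activity Update http://example.org/vocab/term1 2024-01-01T00:00:00Z` prints an `Update` activity whose `object` is the (made-up) term IRI.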

The data processing workflow is built as an RDF-Connect pipeline that performs several data transformation steps, which include:

  • Raw vocabulary fetching over HTTP
  • Change detection and semantic labeling with Activity Streams 2
  • Fragmentation based on temporal constraints
  • Ingestion into a target data store system
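The change detection step can be pictured with a small sketch that compares two snapshots of a vocabulary and labels each difference with an Activity Streams 2 activity type. This is not the pipeline's actual implementation, which is assembled from RDF-Connect processors operating on RDF data. The snapshot format (one `IRI<TAB>content-hash` line per term) and the function name `detect_changes` are assumptions made for this example only.

```shell
#!/bin/sh
# Hypothetical helper: compare two vocabulary snapshots and print one
# "<AS2 activity type><TAB><term IRI>" line per detected change.
# Assumption: each snapshot file holds "IRI<TAB>hash" lines, one per term.
detect_changes() {
  old_snapshot=$1
  new_snapshot=$2
  awk -F '\t' '
    # Old snapshot: remember the hash of every previously known term.
    FILENAME == ARGV[1] { old[$1] = $2; next }
    # New snapshot: classify each term against the old snapshot.
    {
      seen[$1] = 1
      if (!($1 in old))        print "Create\t" $1
      else if (old[$1] != $2)  print "Update\t" $1
    }
    # Terms absent from the new snapshot are reported as deleted.
    END { for (iri in old) if (!(iri in seen)) print "Delete\t" iri }
  ' "$old_snapshot" "$new_snapshot" | LC_ALL=C sort
}
```

Calling `detect_changes old.tsv new.tsv` yields a sorted list of `Create`, `Update` and `Delete` lines. Unchanged terms produce no output, which is what keeps the resulting feed limited to actual changes.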

The publishing is done via an instance of the ldes-server, which sits on top of the data store used by the RDF-Connect pipeline to write the data.

System components and architecture

TODO: Diagram and description of pipeline components.

How to run it?

To run the pipeline locally, you need to make sure all the required components are up and running. These include:

  • A Redis or MongoDB instance (see /datastore for more information)
  • An instance of the ldes-server (see /ldes-server for more information)
  • Optionally, a Varnish instance for caching (see /varnish for more information)

Next, you need to configure all the environment variables in the conf.env file according to your local setup.
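Before starting the pipeline, it can help to verify that conf.env actually defines the values you expect. The sketch below is a hypothetical pre-flight check, not part of this repository: the function name `check_conf_env` and the variable names used in the usage example are placeholders, so substitute the names that really appear in your conf.env. It also assumes the file uses plain `KEY=value` lines that a POSIX shell can source.

```shell
#!/bin/sh
# Hypothetical helper: report which of the given variables are missing or
# empty in an env file. Usage: check_conf_env FILE VAR [VAR...]
# Exit status: 0 = all set, 1 = at least one missing, 2 = file unreadable.
check_conf_env() {
  env_file=$1
  shift
  # Make sure "." reads the file itself rather than searching PATH.
  case $env_file in
    */*) ;;
    *) env_file=./$env_file ;;
  esac
  if [ ! -r "$env_file" ]; then
    echo "cannot read $env_file" >&2
    return 2
  fi
  (
    # Source inside a subshell so the caller's environment stays untouched.
    . "$env_file"
    missing=0
    for name in "$@"; do
      eval "value=\${$name:-}"
      if [ -z "$value" ]; then
        echo "missing: $name"
        missing=1
      fi
    done
    exit "$missing"
  )
}
```

A call such as `check_conf_env conf.env VOCAB_URL STORE_HOST` (both names are placeholders) prints one `missing: NAME` line per unset or empty variable and returns a non-zero status, so it can be used as a guard in front of `./run.sh`.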

Finally, you can run an execution loop of the pipeline, which fetches all versions of a given vocabulary (see run.sh), with:

./run.sh 

With Docker

This pipeline and the necessary data storage and interface components are containerized using Docker and can be started all together using docker-compose as follows:

$ docker-compose up --build 

The conf.env file contains the main configuration variables to be set.
