Cloud-Native Document Management

Document management is an established software-powered business domain. As with most software applications, an ongoing trend is to move document management systems (DMS) and related functionality (Enterprise Content Management, ECM; Content Services, CS) into well-defined services, primarily in cloud environments. The result is cloud-native document management systems/services (CNDMS).

In the context of the research initiative on cloud-native applications and the ARKIS project within the Service Prototyping Lab, we have been looking deeper into the issues surrounding cloud-native document management and have built a prototype implementation to test-drive new ideas and concepts. This post introduces the software and the challenges already solved with it.

Moving document management into the cloud raises important questions about rules, procedures and technologies. Some of the challenges are:

  • How to deploy, run and operate the entire cloud-native application. Cloud application stacks evolve, as do the stack-specific languages used to describe applications. The usage patterns are diverse, from batch document scans to full-text search, and the multi-tenancy and permission models are complex.
  • Where to store and manage the metadata as well as the data. The obvious candidates are self-hosted (i.e. application-controlled) database services and provider-operated equivalents, each with their own shortcomings concerning strict compliance and auditing. We have previously introduced CNDBbench to gain more insights and continue to do so by improving and extending the tool. Another challenge is to define a well-performing, secure and still affordable isolation of tenants.
  • How to bill the users per use. This requires domain-specific metrics and business rules such as costs associated with document pages.
  • How to establish an ecosystem around the core service. In document management, many specialised services around physical and digital documents exist and benefit from a uniformly specified service interface.
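
To make the per-use billing challenge more concrete, here is a minimal Python sketch of a page-based business rule. The metric names (pages stored, pages scanned, searches), the rates and the `Usage` and `invoice_total` names are all hypothetical and chosen for illustration; they are not taken from ARKIS.

```python
from dataclasses import dataclass
from decimal import Decimal

# Hypothetical per-unit rates, for illustration only (not ARKIS pricing).
RATE_PER_PAGE_STORED = Decimal("0.002")
RATE_PER_PAGE_SCANNED = Decimal("0.010")
RATE_PER_SEARCH = Decimal("0.001")


@dataclass(frozen=True)
class Usage:
    """Metered usage of one tenant within one billing period."""
    tenant: str
    pages_stored: int
    pages_scanned: int
    searches: int


def invoice_total(usage: Usage) -> Decimal:
    """Apply the per-unit rates and round the sum to two decimal places."""
    total = (
        usage.pages_stored * RATE_PER_PAGE_STORED
        + usage.pages_scanned * RATE_PER_PAGE_SCANNED
        + usage.searches * RATE_PER_SEARCH
    )
    return total.quantize(Decimal("0.01"))


if __name__ == "__main__":
    usage = Usage(tenant="acme", pages_stored=12000, pages_scanned=350, searches=900)
    print(usage.tenant, invoice_total(usage))
```

Decimal arithmetic is used instead of floats so that small per-page amounts add up without binary rounding drift.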

A CNDMS needs to be designed to answer these questions satisfactorily. Our design is the result of several studies and experiments in this direction.
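
The tenant isolation trade-off from the list of challenges can be illustrated with two common strategies: a shared table filtered by tenant, and one database per tenant. The following Python sketch uses in-memory SQLite purely as a stand-in. The class names and schema are assumptions for illustration and do not describe how ARKIS stores its metadata.

```python
from __future__ import annotations

import sqlite3


class SharedTableStore:
    """All tenants share one table; every query must filter on tenant_id."""

    def __init__(self) -> None:
        self.db = sqlite3.connect(":memory:")
        self.db.execute(
            "CREATE TABLE documents (tenant_id TEXT NOT NULL, title TEXT NOT NULL)"
        )

    def add(self, tenant: str, title: str) -> None:
        self.db.execute("INSERT INTO documents VALUES (?, ?)", (tenant, title))

    def titles(self, tenant: str) -> list[str]:
        rows = self.db.execute(
            "SELECT title FROM documents WHERE tenant_id = ? ORDER BY title",
            (tenant,),
        )
        return [title for (title,) in rows]


class DatabasePerTenantStore:
    """Each tenant gets its own database; isolation is structural."""

    def __init__(self) -> None:
        self.dbs: dict[str, sqlite3.Connection] = {}

    def _db(self, tenant: str) -> sqlite3.Connection:
        # Create the tenant's database lazily on first access.
        if tenant not in self.dbs:
            db = sqlite3.connect(":memory:")
            db.execute("CREATE TABLE documents (title TEXT NOT NULL)")
            self.dbs[tenant] = db
        return self.dbs[tenant]

    def add(self, tenant: str, title: str) -> None:
        self._db(tenant).execute("INSERT INTO documents VALUES (?)", (title,))

    def titles(self, tenant: str) -> list[str]:
        rows = self._db(tenant).execute("SELECT title FROM documents ORDER BY title")
        return [title for (title,) in rows]


if __name__ == "__main__":
    for store in (SharedTableStore(), DatabasePerTenantStore()):
        store.add("tenant-a", "invoice.pdf")
        store.add("tenant-b", "contract.pdf")
        print(type(store).__name__, store.titles("tenant-a"))
```

The shared table is cheaper to operate but relies on every query carrying the tenant filter, whereas separate databases cost more resources per tenant but cannot leak rows through a forgotten filter.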

The design has been implemented as a prototype: a composite CNDMS application consisting of web frontend, processing backend and state handling parts. Each microservice is implemented as a container. The containers are orchestrated using several description languages, some of which we have covered before, for instance Vamp Blueprints and Kubernetes descriptors generated from Docker-Compose, and some of which we will cover soon, for instance Docker-Swarm. For simplicity, the software is named after the project and versioned in incremental steps corresponding to the development iterations. Thus, ARKIS 2.5 is the latest prototype and incorporates the functionality to manage document metadata with various database server, service and multi-tenancy options. The following screenshot gives a first impression of the functionality.

Future work will focus on differentiated storage, more options in terms of implementation languages for the microservices, and additional large-scale experiments.

4 thoughts on “Cloud-Native Document Management”

  1. July.23.2018
    Disclaimer: I am learning micro-services deployment on K8s. I am new to micro-services and K8s.
    Hello,
    I followed all the steps to deploy the ARKIS prototype from GitHub. I chose the non-persistence option. I have set up the k8s cluster based on the shell scripts.
    Now, I want to check if everything is fine.
    Tutorial/Use indicates that I should be able to log in and run. However, I don’t know which IP:Port I need to point my browser to in order to reach the login service.

    Please point me to next steps here.

    Thanks!
    Yuva

  2. Have a look at the “frontend microservices” section of the instructions in README.md. It should work on HOST:32001 where HOST is determined and reported by Google Cloud.

  3. Hello Josef,
    Thanks for your prompt response.

    To understand how to use Kubernetes, I worked through a sample at https://cloud.google.com/kubernetes-engine/. I have summarized the steps below [1].

    I would like to map these steps to ARKIS micro-services. Let me know if my understanding below is right:
    a) Deployment of app from Google’s repository -> I assume that for this project, we need to use https://hub.docker.com/u/chumbo/ to deploy an existing binary. Am I right?
    b) Open ports -> I thought this should be a manual step I need to do on my Kubernetes cluster. From what I read above, HOST:32001 should be available. How can I make sure that this is the case?
    c) Check availability -> Using the right URL should get me to the right page. Would step (b) above help here?

    Since this is my first time, I need a little more help here. Thanks for your time.

    Regards,
    Yuvaraj
    [1]
    Step 1: CREATE A CLUSTER
    gcloud container clusters create k0
    Step 2: SPIN UP A DEPLOYMENT FROM GOOGLE CONTAINER REGISTRY
    kubectl run app --image gcr.io/google-samples/hello-app:1.0
    Step 3: SCALE DEPLOYMENT
    kubectl scale deployment app --replicas 3
    Step 4: OPEN PORTS
    kubectl expose deployment app --port 80 --type=LoadBalancer
    Step 5: CONFIRM DEPLOYMENT
    kubectl get service app
    Step 6: CONFIRM AVAILABILITY
    curl http://203.0.113.105:80
