Serverless plumbing: Connecting a Nextcloud file store to Knative

In a previous post, we showed how it’s possible to trigger a Knative service when a database update occurs using the Debezium Kafka Connect plug-in connected to Knative; here, we continue this work by describing how we connected a Nextcloud file storage service to Knative, triggering a Knative service/function when a file is uploaded to Nextcloud.

Serverless Platform with Database and Filestore integration

When starting this work, we investigated the Nextcloud API mechanisms as a potential means to support this: while they provide lots of useful functionality, they largely operate within an authenticated realm and as such, an approach which leverages such APIs would be limited to that realm. For our initial work, this was not what we wanted – we wanted to be able to trigger events based on any updates to the Nextcloud file store.

We then happened to look at the database underpinning Nextcloud and observed that it contains a table called oc_activity which tracks Nextcloud activity: all file uploads, deletions, folder creations etc. are recorded in this table together with timestamp, user and other related information. This was the perfect source to use to trigger Knative functions; further, as it was based on a standard database, it could leverage our earlier Knative-database integration work.

The most basic integration was performed by creating a new Debezium connector to the Nextcloud database; using this, it was straightforward to obtain a CloudEvent containing information from Nextcloud in a Knative service. As such, we thought that this solution for Nextcloud integration – albeit one with shortcomings discussed below – facilitated easy interactions between Nextcloud and Knative. This was not, alas, the complete story.
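To make the shape of such an event concrete, here is a minimal Go sketch of decoding it. It assumes the CloudEvent data is the Debezium change event as JSON, either bare or wrapped under "payload" (as happens when the converter embeds the schema). The oc_activity column names and the "file_created" activity type are assumptions based on a typical Nextcloud schema, and the helper names parseChange and isFileCreated are purely illustrative.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Activity holds the oc_activity columns this sketch uses.
// Column names are assumed, not taken from the original service.
type Activity struct {
	Timestamp int64  `json:"timestamp"`
	Type      string `json:"type"`
	User      string `json:"user"`
	File      string `json:"file"`
}

// ChangeEvent mirrors the Debezium envelope: "op" is "c" for an insert,
// "u" for an update, "d" for a delete; "after" is the row after the change.
type ChangeEvent struct {
	Op    string    `json:"op"`
	After *Activity `json:"after"`
}

// wrapped covers events where the envelope sits under a "payload" key.
type wrapped struct {
	Payload *ChangeEvent `json:"payload"`
}

// parseChange accepts either the wrapped or the bare envelope.
func parseChange(data []byte) (*ChangeEvent, error) {
	var w wrapped
	if err := json.Unmarshal(data, &w); err != nil {
		return nil, err
	}
	if w.Payload != nil {
		return w.Payload, nil
	}
	var ev ChangeEvent
	if err := json.Unmarshal(data, &ev); err != nil {
		return nil, err
	}
	return &ev, nil
}

// isFileCreated reports whether the event is a newly inserted activity row
// describing a file creation ("file_created" is an assumed type value).
func isFileCreated(ev *ChangeEvent) bool {
	return ev != nil && ev.Op == "c" && ev.After != nil && ev.After.Type == "file_created"
}

func main() {
	sample := []byte(`{"payload":{"op":"c","after":{"timestamp":1580000000,` +
		`"type":"file_created","user":"alice","file":"/Uploads/report.pdf"}}}`)
	ev, err := parseChange(sample)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	if isFileCreated(ev) {
		fmt.Printf("%s uploaded %s\n", ev.After.User, ev.After.File)
	}
}
```

Filtering on the operation and activity type matters because the table records every kind of activity, not only uploads.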

To progress further, we wanted to develop a simple application which could add a digital timestamp to a PDF document when one was dropped into an Uploads directory. This was implemented as a small Go program which handled the incoming CloudEvent and triggered a small bash script which leveraged pdfstamp, ImageMagick convert and qpdf to timestamp the document. Testing this workflow on a local machine was straightforward.
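The following is a minimal sketch of such a handler, not the original program. It assumes the event arrives as an HTTP POST whose body carries the file path under payload.after.file, and that the bash script lives at /app/stamp.sh and takes the path as its only argument; both the script path and the helper names (filePath, shouldStamp) are hypothetical.

```go
package main

import (
	"context"
	"encoding/json"
	"fmt"
	"io"
	"log"
	"net/http"
	"os"
	"os/exec"
	"path"
	"strings"
	"time"
)

const stampScript = "/app/stamp.sh" // hypothetical location of the bash script

// filePath extracts payload.after.file from the event body (assumed layout).
func filePath(body []byte) string {
	var ev struct {
		Payload struct {
			After struct {
				File string `json:"file"`
			} `json:"after"`
		} `json:"payload"`
	}
	if err := json.Unmarshal(body, &ev); err != nil {
		return ""
	}
	return ev.Payload.After.File
}

// shouldStamp reports whether the path is a PDF inside the Uploads directory.
// Cleaning the path first rejects tricks such as "/Uploads/../x.pdf".
func shouldStamp(file string) bool {
	clean := path.Clean("/" + file)
	return strings.HasPrefix(clean, "/Uploads/") &&
		strings.EqualFold(path.Ext(clean), ".pdf")
}

func handle(w http.ResponseWriter, r *http.Request) {
	body, err := io.ReadAll(io.LimitReader(r.Body, 1<<20))
	if err != nil {
		http.Error(w, "cannot read event", http.StatusBadRequest)
		return
	}
	file := filePath(body)
	if !shouldStamp(file) {
		w.WriteHeader(http.StatusAccepted) // not relevant here; acknowledge and ignore
		return
	}
	ctx, cancel := context.WithTimeout(r.Context(), 2*time.Minute)
	defer cancel()
	// Pass the path as an argument rather than building a shell string.
	out, err := exec.CommandContext(ctx, stampScript, file).CombinedOutput()
	if err != nil {
		log.Printf("stamping %q failed: %v\n%s", file, err, out)
		http.Error(w, "stamping failed", http.StatusInternalServerError)
		return
	}
	// In binary-mode CloudEvents the event id travels in the Ce-Id header.
	log.Printf("stamped %q (event %s)", file, r.Header.Get("Ce-Id"))
	w.WriteHeader(http.StatusAccepted)
}

func main() {
	port := os.Getenv("PORT") // Knative sets PORT for the service container
	if port == "" {
		// Outside Knative: dry-run the filtering logic on a sample event and exit.
		sample := []byte(`{"payload":{"after":{"file":"/Uploads/report.pdf"}}}`)
		fmt.Println("would stamp:", shouldStamp(filePath(sample)))
		return
	}
	http.HandleFunc("/", handle)
	log.Fatal(http.ListenAndServe(":"+port, nil))
}
```

Running the script synchronously inside the request keeps the sketch short; a real service might prefer to acknowledge the event first and do the work in the background.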

We built a container based on the standard Ubuntu image and deployed it to Knative via Docker Hub. The new Knative service reacted to the events coming from Nextcloud immediately, but we observed that the service was unable to communicate with the Nextcloud instance to download the file, add the timestamp and upload it again.

This was difficult to troubleshoot, mostly because we lacked a deep understanding of the Knative platform. Istio was at the heart of the issue: an Istio sidecar was automatically deployed with the Knative service and policed its network activity. The default configuration of Knative is extremely restrictive regarding network access, with all external access blocked by default. We were able to relax this restriction by whitelisting IP address ranges via changes to the config-network ConfigMap. This still did not facilitate basic HTTP access to external services; we also needed to change the Istio configuration to permit access to any external service. At this point, it was possible to curl arbitrary endpoints and we thought we were out of the woods.

However, when we switched back to our simple application, we were still unable to interact with Nextcloud: the sidecar was still blocking requests. This is likely due to the istio-proxy performing HTTP-based filtering, most probably on requests containing authentication headers.

Having had so many difficulties with Istio, we decided to disable Istio sidecar injection in the namespace. For sure, this is not best practice, but the small application worked once this change was made and we will proceed in this manner until we have a better understanding of how to configure Istio to support such services.
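For context, the exchange that the sidecar was blocking is small. The following Go sketch assumes the download and upload go over Nextcloud's WebDAV endpoint (remote.php/dav/files/USER/PATH) with HTTP basic authentication; the post does not say which mechanism the script used, and the host name, user and helper names here are illustrative only.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/url"
	"strings"
)

// davURL builds the WebDAV URL for a user's file, escaping each path segment.
func davURL(base, user, file string) string {
	segs := strings.Split(strings.Trim(file, "/"), "/")
	for i, s := range segs {
		segs[i] = url.PathEscape(s)
	}
	return strings.TrimRight(base, "/") + "/remote.php/dav/files/" +
		url.PathEscape(user) + "/" + strings.Join(segs, "/")
}

// transfer issues an authenticated request: GET to download, PUT to upload.
// The caller must close the response body on success.
func transfer(c *http.Client, method, target, user, pass string, body io.Reader) (*http.Response, error) {
	req, err := http.NewRequest(method, target, body)
	if err != nil {
		return nil, err
	}
	req.SetBasicAuth(user, pass) // adds the Authorization header
	resp, err := c.Do(req)
	if err != nil {
		return nil, err
	}
	if resp.StatusCode >= 300 {
		resp.Body.Close()
		return nil, fmt.Errorf("%s %s: %s", method, target, resp.Status)
	}
	return resp, nil
}

func main() {
	// Only the URL is printed here; no request is sent.
	fmt.Println(davURL("https://cloud.example.org/", "alice", "/Uploads/my file.pdf"))
}
```

A download would be transfer(http.DefaultClient, http.MethodGet, target, user, pass, nil), and the upload of the stamped file a PUT to the same URL with the file contents as the body.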

As with our database integration, this is a reasonable initial experiment which indicates how a complete solution may look. Clearly, this work leaves several security aspects unaddressed: disabling Istio is problematic, access rights between the filestore and Knative are not considered sufficiently, access to the filestore activity channel is not controlled, etc. We also observed cases in which some events are not delivered, usually when a serverless container is starting up, and cases in which events are delivered in batches; we need to investigate both in more detail. We also saw the known issues associated with serverless function startup time; these matter in general, although they are less critical for this particular toy application, which is inherently asynchronous.

We’ve still got lots of work to do to come up with a more complete system design…more soon…

