[OpenWhisk] Rolling, Action! (part 3)

Now that we have a working OpenWhisk instance, let’s make actions in swift. I chose a simple chain that highlights the 3 things I wanted to showcase, but I’m keenly aware that it’s artificial in many ways. It’s just a demo. The whole thing will be available at the end of the series.

The Premise

I want to have a process that reads a blog’s jsonfeed and, every time it’s updated, checks for broken links in the items of the blog. In this post, we’ll look at how to make 3 different actions that will ultimately be chained together.

Preempting:

“You could have done all 3 in one action”

Yes, of course. I wanted to display 3 different ways of building the actions because of the various pitfalls they exhibit, and because I want a chain.

Action 1 takes a URL and lists all the links it contains; Action 2 takes a list of URLs and outputs a status dictionary with the status of each of them; Action 3 takes a status dictionary and sends a mail with the list of broken links, as well as a CSV containing the status of all the URLs. Can you spot the chain? ;)

Link Status

Let’s start with the simplest one: it takes a list of URLs as input, and outputs a status dictionary that looks like this:

{
    "status": [
        { "url": "url1", "broken": false },
        { "url": "url2", "broken": true }
    ]
}

No need for anything fancy, it just serially checks if the page loads with a status < 300. Nope, I don’t want redirects either. Again, this is a stupidly simple example to showcase OpenWhisk, not the best practice blog for Swift.

According to the docs, an action is something that takes a json object as an input and spits a json object as the output. So, here goes:

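import Foundation
import Dispatch

// OpenWhisk calls main(args:) with the decoded JSON input
// and serializes the returned dictionary as the JSON output.
func main(args: [String:Any]) -> [String:Any] {
    var result = [ [String:Any] ]()
    if let urlStrings = args["urls"] as? [String] {

        let config = URLSessionConfiguration.default
        let session = URLSession(configuration: config)

        // Check the links serially, one request at a time
        for link in urlStrings {
            if let url = URL(string: link) {
                // main has to return synchronously, so we block until the request completes
                let semaphore = DispatchSemaphore(value: 0)

                session.dataTask(with: url) { data, response, error in
                    if let httpResponse = response as? HTTPURLResponse {
                        // Any status code of 300 or above is reported as broken
                        result.append( [ "url" : link, "broken" : (httpResponse.statusCode >= 300)] )
                    } else if nil != error {
                        // No response at all (DNS failure, unsupported URL, ...) counts as broken too
                        result.append( [ "url" : link, "broken" : true] )
                    } else {
                        print("not an HTTP url")
                    }

                    semaphore.signal()
                }.resume()

                _ = semaphore.wait(timeout: DispatchTime.distantFuture)
            } else {
                // Strings that cannot be parsed as URLs are logged and left out of the result
                print("invalid link")
            }
        }
    }

    return [ "status" : result ]
}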

The next step is adding the action to my list: wsk action create LinkStatus LinkStatus.swift --kind=swift:3.1.1, and it replies ok: created action LinkStatus. Sweet.

Now to test it: wsk action invoke -r LinkStatus -p urls '[ "https://blog.krugazor.eu", "https://blog.krugazor.com" ]' outputs

{
    "status": [
        {
            "broken": false,
            "url": "https://blog.krugazor.eu"
        },
        {
            "broken": true,
            "url": "https://blog.krugazor.com"
        }
    ]
}

😎

Warning: since the action has to be compiled the first time (you’re uploading a code file rather than a binary, after all), the first call may take a while. Subsequent calls will be a lot faster.

Identify Links

Identifying links is easy, right? Parse the html, extract the links, output the links. Except that swift on linux doesn’t do that well, and you can’t use the standard swift:3.1.1 image for that:

  • XMLDocument for HTML works fine on macOS, but spits a ton of errors on Linux
  • the swift:3.1.1 image doesn’t have libxml2-dev, which prevents me from using a 3rd party package like Kanna

So what’s the plan here?

Remember, under the hood, these things are containers, so there should be a way to make a container that works the same way but includes libxml2-dev and allows me to use Kanna.

One way you can go about it is to take the dockerfile for the standard action, add the new dependency and use that image for your actions. I’ll leave it as an exercise for the reader, because I went for a more complex way to do it. Again, this is about showcasing pitfalls and potential solutions.

The idea is ostensibly the same, but the docker image will be local only and precompiled. In practice, you won’t be able to update it by uploading a new main swift file to replace the current one, as you would with the standard image: it’s a one-shot.

The swift code itself will be available later, but it’s not really that interesting: the Package file now includes Kanna, and the main code is maybe 25 lines long. What’s important is the build process.
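Since the Kanna-based source isn’t shown here, the following is only a rough, dependency-free stand-in to illustrate what the action does, not the code from the post. It uses a regular expression from Foundation instead of a real HTML parser, which is much more fragile; the helper names extractLinks and identifyLinks are made up for the example.

```swift
import Foundation

// Hypothetical helper, NOT the Kanna-based code used in the post:
// collects the value of every double-quoted href attribute in an HTML string.
func extractLinks(fromHTML html: String) -> [String] {
    // A regular expression is a crude stand-in for a real HTML parser.
    guard let regex = try? NSRegularExpression(pattern: "href\\s*=\\s*\"([^\"]*)\"",
                                               options: [.caseInsensitive]) else {
        return []
    }
    let source = NSString(string: html)
    let fullRange = NSRange(location: 0, length: source.length)
    let matches = regex.matches(in: html, options: [], range: fullRange)
    // Capture group 1 holds the attribute value without the quotes.
    return matches.map { source.substring(with: $0.range(at: 1)) }
}

// Shaped like the action's output: the page URL plus the links it contains.
func identifyLinks(url: String, html: String) -> [String: Any] {
    return ["url": url, "urls": extractLinks(fromHTML: html)]
}
```

Like the real action, this returns the links exactly as they appear in the page, so relative paths and "#" anchors come out untouched.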

OpenWhisk will pull docker images from the hub if you use --docker instead of --kind, but I don’t want to have my images up there. Luckily, if an image is in the whisk namespace, the local version is used. So, all we need to do is build an image that has my dependency and my code.

UPDATE: Rodric Rabbah underlines the fact that this trick with whisk/xxx images being local instead of pulled from somewhere only works in local deployments

Because we don’t want the code to be updatable, I also changed the actionproxy.py file to just return a 200 code on /init. Armed with this modified file, my Package and main swift files, and the rest of the regular scaffolding, the Dockerfile looks like this:

# Dockerfile for swift actions, overrides and extends ActionRunner from actionProxy
# This Dockerfile is partially based on: https://github.com/IBM-Swift/swift-ubuntu-docker/blob/master/swift-development/Dockerfile
FROM ibmcom/swift-ubuntu:3.1.1
 
# Set WORKDIR
WORKDIR /
 
# Upgrade and install basic Python dependencies
RUN apt-get -y update \
 && apt-get -y install --fix-missing python2.7 python-gevent python-flask zip libxml2-dev
 
# Add the action proxy
RUN mkdir -p /actionProxy
ADD invoke.py /actionProxy
 
# Add files needed to build and run action
RUN mkdir -p /swiftAction
ADD epilogue.swift /swiftAction
ADD buildandrecord.py /swiftAction
ADD swiftrunner.py /swiftAction
# ADD spm-build /swiftAction/spm-build
RUN mkdir -p /swiftAction/spm-build
ADD _Whisk.swift /swiftAction/spm-build
ADD _WhiskJSONUtils.swift /swiftAction/spm-build
 
# Our stuff
COPY Package.swift /swiftAction/spm-build/
COPY main.swift /swiftAction/spm-build/
COPY actionproxy.py /actionProxy/
 
 
# Build Action
RUN touch /swiftAction/spm-build/main.swift
# RUN cd /swiftAction/spm-build; swift package update
# RUN cd /swiftAction/spm-build; swift build -v -c release
RUN python /swiftAction/buildandrecord.py
 
ENV FLASK_PROXY_PORT 8080
 
CMD ["/bin/bash", "-c", "cd /swiftAction && PYTHONIOENCODING='utf-8' python -u swiftrunner.py"]

I now need to build it so that OpenWhisk can use it, and create the action:

docker build . -t whisk/identifylinks:latest
wsk action create identifylinks --docker whisk/identifylinks
wsk action invoke identifylinks -p url https://www.krugazor.eu -r

and the result?

{
    "url": "https://www.krugazor.eu",
    "urls": [
        "styles.css",
        "#article1",
        "#article2",
        "#article3",
        "#",
        "http://twitter.com/krugazor",
        "http://blog.krugazor.eu",
        "#",
        "mailto:zino@krugazor.eu",
        "#",
        "http://krugazor.eu/highlight",
        "http://krugazor.free.fr/software/desinstaller",
        "http://wiki.hudson-ci.org",
        "http://krugazor.free.fr/software/alphabeta",
        "#"
    ]
}

Mailer

Last, but not least, I’d like to make a CSV and a mail to report those broken links. There is Swift-SMTP, but it requires swift 4. Dang it. Ah but digging around in the ibmcom/swift-ubuntu catalog, there is a 4.0.3 version. So I’ll just use this instead of the 3.1.1 variant, right?

Close, but no cigar: the way the swiftrunner works, it looks at the output of the build process for validation. The build command fails because it thinks it hasn’t built it properly. Yes, I know I could have just removed it entirely, but this is an incremental process to understand how the images work. So, I needed to patch the python build script like this:

27c27
< LINKER_SUBSTRING =  "-module-name Action -emit-executable -Xlinker"
---
> LINKER_PREFIX =  "/usr/bin/swiftc -Xlinker '-rpath=$ORIGIN' '-L/swiftAction/spm-build/.build/release' -o '/swiftAction/spm-build/.build/release/Action'"
50c50
< elif instruction.find(LINKER_SUBSTRING) > -1:
---
>     elif instruction.startswith(LINKER_PREFIX):

And presto! It now works like the previous case, with swift 4 (and additional libs if I wanted to).

There is another problem I need to address: I don’t want to hardcode the smtp credentials, and there is no obvious mechanism for storing secrets. I just decided in the end to include a json configuration file in the image, because it’s local, so unless an attacker has access to my server (in which case they don’t need the credentials anymore), it’s as safe as can be, and I can still maintain it outside of the code. It’s not a perfect solution by any stretch of the imagination, but it works.
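Reading such a file from the action could look something like the sketch below, using Codable from swift 4. The field names in MailConfig are pure assumptions (the actual file isn’t shown here), and the default path simply assumes the file is copied to /swiftAction/config.json in the image.

```swift
import Foundation

// Hypothetical shape for config.json; every field name here is an assumption.
struct MailConfig: Codable {
    let host: String
    let port: Int
    let user: String
    let password: String
    let recipient: String
}

// Decodes the configuration from raw JSON; returns nil if a field is missing or mistyped.
func decodeConfig(from data: Data) -> MailConfig? {
    return try? JSONDecoder().decode(MailConfig.self, from: data)
}

// Reads the file baked into the image; the default path is an assumption.
func loadConfig(atPath path: String = "/swiftAction/config.json") -> MailConfig? {
    guard let data = FileManager.default.contents(atPath: path) else { return nil }
    return decodeConfig(from: data)
}
```

Returning nil rather than crashing lets the action answer with something like "success": false when the file is absent or malformed.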

The final Dockerfile looks like this:

FROM ibmcom/swift-ubuntu:4.0.3
 
# Set WORKDIR
WORKDIR /
 
# Upgrade and install basic Python dependencies
RUN apt-get -y update \
 && apt-get -y install --fix-missing python2.7 python-gevent python-flask zip
 
# Add the action proxy
RUN mkdir -p /actionProxy
ADD invoke.py /actionProxy
 
# Add files needed to build and run action
RUN mkdir -p /swiftAction
ADD epilogue.swift /swiftAction
ADD buildandrecord.py /swiftAction
ADD swiftrunner.py /swiftAction
# ADD spm-build /swiftAction/spm-build
RUN mkdir -p /swiftAction/spm-build
ADD _Whisk.swift /swiftAction/spm-build
ADD _WhiskJSONUtils.swift /swiftAction/spm-build
 
# Our stuff
COPY Package.swift /swiftAction/spm-build/
COPY main.swift /swiftAction/spm-build/
COPY config.json /swiftAction/
COPY actionproxy.py /actionProxy/
 
 
# Build Action
RUN touch /swiftAction/spm-build/main.swift
# RUN cd /swiftAction/spm-build; swift package update
# RUN cd /swiftAction/spm-build; swift build -v -c release
RUN python /swiftAction/buildandrecord.py
 
ENV FLASK_PROXY_PORT 8080
 
CMD ["/bin/bash", "-c", "cd /swiftAction && PYTHONIOENCODING='utf-8' python -u swiftrunner.py"]

Build, install, and run!

docker build . -t whisk/summary:latest
wsk action create linkssummary --docker whisk/summary
wsk action invoke -r linkssummary -p status '[
        {
            "broken": false,
            "url": "https://blog.krugazor.eu"
        },
        {
            "broken": true,
            "url": "https://blog.krugazor.com"
        }
    ]
'

And?

{
    "success": true
}
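The mailer’s source isn’t shown here, but the part that turns the status list into a list of broken links and a CSV can be sketched with Foundation alone. This is an illustration under assumptions, not the actual action: the function names and the "url,broken" header are made up, and the mail sending itself (Swift-SMTP) is left out.

```swift
import Foundation

// Quotes a CSV field when it contains a comma, a quote or a line break.
func csvField(_ value: String) -> String {
    let needsQuoting = value.contains(",") || value.contains("\"") || value.contains("\n")
    guard needsQuoting else { return value }
    return "\"" + value.replacingOccurrences(of: "\"", with: "\"\"") + "\""
}

// Turns the status list into (broken links, CSV text).
// Entries that lack a "url" string or a "broken" boolean are skipped.
func summarize(status: [[String: Any]]) -> (broken: [String], csv: String) {
    var broken = [String]()
    var lines = ["url,broken"]
    for entry in status {
        guard let url = entry["url"] as? String,
              let isBroken = entry["broken"] as? Bool else { continue }
        if isBroken { broken.append(url) }
        lines.append(csvField(url) + "," + (isBroken ? "true" : "false"))
    }
    return (broken, lines.joined(separator: "\n"))
}
```

The list of broken links would go in the body of the mail, and the CSV text in the attachment.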

Chaining them

I obviously built the actions so that each one outputs what the next one expects in terms of format. So now, all I have to do is create a sequence:
wsk action create checkChain --sequence identifylinks,LinkStatus,linkssummary

And now, if I invoke it, it will pass my param to the first one, then the output of that to the second one, and the output of that to the last one. I used an asynchronous call in order to show some basic debugging techniques:

$ wsk action invoke checkChain -p url http://www.krugazor.eu
ok: invoked /_/checkChain with id 4c79cc927f7c4a7fb9cc927f7cba7fa0
$ wsk activation logs 4c79cc927f7c4a7fb9cc927f7cba7fa0
80e722003ee4468ca722003ee4e68c35
7640c722c8434cac80c722c8437cace8
007f3b1ee48548f8bf3b1ee48508f8c9
$ wsk activation result 80e722003ee4468ca722003ee4e68c35
{
    "url": "http://www.krugazor.eu",
    "urls": [
        "styles.css",
...
}
$ wsk activation result 7640c722c8434cac80c722c8437cace8
{
    "status": [
        {
            "broken": true,
            "url": "styles.css"
        },
        {
            "broken": false,
            "url": "http://blog.krugazor.eu"
        },
        {
            "broken": true,
            "url": "#"
        },
...
}
$ wsk activation result 007f3b1ee48548f8bf3b1ee48508f8c9
{
    "success": true
}
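Conceptually, a sequence is just function composition over JSON dictionaries. The sketch below mimics that locally; it is not how OpenWhisk implements sequences. The three closures are fake stand-ins for the real actions (no network, no mail), and the sequence helper is invented for the example.

```swift
import Foundation

// An action, seen from the outside: JSON dictionary in, JSON dictionary out.
typealias Action = ([String: Any]) -> [String: Any]

// Mimics a sequence: each action receives the previous action's output.
func sequence(_ actions: [Action]) -> Action {
    return { input in
        actions.reduce(input) { params, action in action(params) }
    }
}

// Stand-ins for the three real actions, only here to show the data flow.
let identify: Action = { args in
    ["url": (args["url"] as? String) ?? "", "urls": ["http://blog.krugazor.eu", "#"]]
}
let linkStatus: Action = { args in
    let urls = args["urls"] as? [String] ?? []
    // Fake rule for the demo: anything that does not start with "http" is "broken".
    return ["status": urls.map { url -> [String: Any] in
        ["url": url, "broken": !url.hasPrefix("http")]
    }]
}
let summary: Action = { args in
    ["success": args["status"] is [[String: Any]]]
}

let checkChain = sequence([identify, linkStatus, summary])
print(checkChain(["url": "http://www.krugazor.eu"]))
```

The only contract between two neighbours is the set of keys they exchange ("urls", then "status"), which is why each intermediate result can be inspected on its own, as in the activation results above.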

Final thoughts on the sequence

Of course, this isn’t a production thing, it’s a demo thing. We could have used a single action in swift 4 that does all that, but the idea was to show the evolution of the action from the “standard” way of operating a swift 3.1.1 action, to a small hack to have it include an external dependency, to swift 4 + dependencies.

The link identification could be better as well: removing the partial URLs or completing them to make them whole again, and dealing with all sorts of badly coded things. But I hope you can see that if you plan your actions well, they work like functions: you can reuse some of them in various sequences, or have several kinds of sequences that skip parts of the actions, etc.
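As an illustration of that first improvement, here is a minimal sketch of a clean-up step that resolves relative links against the page URL and drops what can’t be checked over HTTP. The function name normalizeLinks and the filtering rules are assumptions for the example, not part of the actions above.

```swift
import Foundation

// Hypothetical clean-up step: resolves relative links against the page URL and
// drops anything that is not http(s), such as mailto: links or bare "#" anchors.
func normalizeLinks(_ links: [String], pageURL: String) -> [String] {
    guard let base = URL(string: pageURL) else { return [] }
    var seen = Set<String>()
    var result = [String]()
    for link in links {
        // Pure fragments point back into the same page, so there is nothing to check.
        if link.hasPrefix("#") { continue }
        guard let resolved = URL(string: link, relativeTo: base)?.absoluteURL,
              let scheme = resolved.scheme?.lowercased(),
              scheme == "http" || scheme == "https" else { continue }
        // Keep only the first occurrence of each URL.
        if seen.insert(resolved.absoluteString).inserted {
            result.append(resolved.absoluteString)
        }
    }
    return result
}
```

With a page URL of "https://www.krugazor.eu/", a link like "styles.css" becomes "https://www.krugazor.eu/styles.css", while "#" and the mailto: entry disappear. Such a step could live inside the first action or become a fourth action slotted into the sequence.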

Next time, we’ll look at how to trigger that sequence other than manually.