
Walkthrough

This walkthrough shows how to serve an application using SSF and deploy it on Gcore to use IPUs. As a prerequisite, follow the installation instructions to install and enable the Poplar SDK with PopTorch in the environment.

Select a model

For this example we deploy a pre-trained question-answering model from Hugging Face. distilbert-base-cased-distilled-squad will do the trick 🤗
The model itself can be imported from the optimum-graphcore library as an inference pipeline:

from optimum.graphcore import pipeline

question_answerer = pipeline(
    "question-answering", model="distilbert-base-cased-distilled-squad"
)

Note that the input is a dictionary containing question and context strings. The output is also a dictionary containing an answer string, the score, and the start and end positions of the answer in the context string.
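
For instance, a quick sketch of a direct call (the numeric values shown here are illustrative):

result = question_answerer(
    {"question": "What is your name?", "context": "My name is Rob."}
)
print(result)
# Illustrative output: {'score': 0.98, 'start': 11, 'end': 14, 'answer': 'Rob'}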

Implement the application interface

To interface our model with SSF we need to implement the application interface SSFApplicationInterface. The following file my_app.py shows the code needed for this:

from optimum.graphcore import pipeline
import logging
from ssf.application_interface.application import SSFApplicationInterface
from ssf.application_interface.utils import get_ipu_count
from ssf.application_interface.results import RESULT_OK, RESULT_APPLICATION_ERROR

logger = logging.getLogger()


class MyApplication(SSFApplicationInterface):
    def __init__(self):
        self.question_answerer: pipeline = None
        self.dummy_inputs_dict = {
            "question": "What is your name?",
            "context": "My name is Rob.",
        }


    def build(self) -> int:
        if get_ipu_count() >= 2:
            logger.info("Compiling model...")
            build_pipeline = pipeline(
                "question-answering", model="distilbert-base-cased-distilled-squad"
            )
            build_pipeline(self.dummy_inputs_dict)
        else:
            logger.info(
                "IPU requirements not met on this device, skipping compilation."
            )
        return RESULT_OK


    def startup(self) -> int:
        logger.info("App started")
        self.question_answerer = pipeline(
            "question-answering", model="distilbert-base-cased-distilled-squad"
        )
        self.question_answerer(self.dummy_inputs_dict)
        return RESULT_OK


    def request(self, params: dict, meta: dict) -> dict:
        result = self.question_answerer(params)
        return result


    def shutdown(self) -> int:
        return RESULT_OK


    def watchdog(self) -> int:
        result = self.question_answerer(self.dummy_inputs_dict)
        return RESULT_OK if result["answer"] == "Rob" else RESULT_APPLICATION_ERROR

Now let's explain this step-by-step.

SSF will serve an instance of MyApplication. To implement the interface we need to define five methods: build, startup, request, shutdown and watchdog (a quick manual smoke test of these methods is sketched after this list):

  • In the __init__ method we define a placeholder for the question_answerer. We also define a dummy input dictionary that will be used to test the pipeline.
class MyApplication(SSFApplicationInterface):
    def __init__(self):
        self.question_answerer: pipeline = None
        self.dummy_inputs_dict = {
            "question": "What is your name?",
            "context": "My name is Rob.",
        }
  • The build method is called when issuing gc-ssf build. It should contain any preliminary steps that we want to happen offline, before running the server.
    Since we are using IPUs, we can compile the model in advance to save time at server startup.
    To do that, we call the pipeline object at least once (the first call triggers compilation). The IPU compilation generates a cache in exe_cache/; we explain later how to package this cache alongside the server. The Hugging Face libraries will also download and cache the model weights. We may not have access to IPUs when running the build step outside of our deployment environment, so we check with the utility function get_ipu_count. If no IPUs are available, build skips compilation, which is then triggered by startup when deployed.
    Note: we return RESULT_OK from the SSF return codes, which is equivalent to return 0.
    def build(self) -> int:
        if get_ipu_count() >= 2:
            logger.info("Compiling model...")
            build_pipeline = pipeline(
                "question-answering", model="distilbert-base-cased-distilled-squad"
            )
            build_pipeline(self.dummy_inputs_dict)
        else:
            logger.info(
                "IPU requirements not met on this device, skipping compilation."
            )
        return RESULT_OK
  • The startup method is called every time the server starts (when issuing gc-ssf run), so it can contain any warmup code we need. We instantiate and call the pipeline with dummy inputs: if the compilation cache exists, this first call attaches the model to the available IPUs; if not, it compiles the model first. Since self.question_answerer has the same lifespan as MyApplication, the model stays attached to the IPUs as long as the MyApplication instance is alive.
    def startup(self) -> int:
        logger.info("App started")
        self.question_answerer = pipeline(
            "question-answering", model="distilbert-base-cased-distilled-squad"
        )
        self.question_answerer(self.dummy_inputs_dict)
        return RESULT_OK
  • The request method is the function executed by our API call. It is important to understand what will be in the params dictionary (the inputs) and in the returned dictionary (the output), as SSF will use them later to generate the API.
    def request(self, params: dict, meta: dict) -> dict:
        result = self.question_answerer(params)
        return result
  • Any resource freeing can be carried out in the shutdown method. We don't need any here, so it simply returns RESULT_OK:
    def shutdown(self) -> int:
        return RESULT_OK
  • Finally, watchdog will be called periodically by our server while no requests are being issued (see option --watchdog-ready-period in options). If it fails, the server will try to kill and restart MyApplication. As an example we verify that we get an expected output from a known input:
    def watchdog(self) -> int:
        result = self.question_answerer(self.dummy_inputs_dict)
        return RESULT_OK if result["answer"] == "Rob" else RESULT_APPLICATION_ERROR
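
Before handing the class over to SSF, we can sanity-check it by driving the interface manually, in the same order SSF would call it. A minimal sketch, run from the directory containing my_app.py with the SSF package installed:

# Manual smoke test of MyApplication outside of SSF (illustrative only).
from my_app import MyApplication
from ssf.application_interface.results import RESULT_OK

app = MyApplication()
assert app.build() == RESULT_OK    # offline step: compiles the model if IPUs are available
assert app.startup() == RESULT_OK  # warmup: instantiates the pipeline and attaches to IPUs
result = app.request(
    {
        "question": "What colour is the ball?",
        "context": "The large green ball bounced down the twisty road",
    },
    meta={},
)
print(result["answer"], result["score"])
assert app.watchdog() == RESULT_OK
assert app.shutdown() == RESULT_OK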

Write SSF config

The SSF config is the point of contact between our application and SSF. It defines all the metadata, the requirements (such as the Python libraries needed for our application, the base Docker image to use, and so on), and our API.

The SSF config folder can be considered the primary application folder or context. All files and modules should be specified relative to the SSF config folder. The current working directory will be set to the application module folder before SSF calls any of the application entry points (build or request etc.).
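
For example, if the application shipped an extra data file next to my_app.py (a hypothetical labels.txt here), any of the interface methods could open it with a simple relative path:

# Hypothetical example: the working directory is the application module folder,
# so files placed alongside my_app.py can be opened with relative paths.
with open("labels.txt") as f:
    labels = f.read().splitlines()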

Let's create ssf_config.yaml :

# Copyright (c) 2023 Graphcore Ltd. All rights reserved.
ssf_version: 1.0.0

application:
  id: qa_api
  name: Question Answering API
  desc: A very simple QA API
  version: 1.0
  module: my_app.py
  ipus: 2
  trace: True
  artifacts: [exe_cache/*]
  dependencies:
    python: --find-links https://download.pytorch.org/whl/cpu/torch_stable.html torch==2.0.1+cpu, optimum-graphcore==0.7.1, tokenizers==0.11.1, numpy==1.23.5
    poplar: ["3.3.0"]
    poplar_wheels: poptorch

  package:
    inclusions: [exe_cache/*]
    exclusions: []
    docker:
        baseimage: "graphcore/pytorch:latest"

endpoints:

  - id: QA
    version: 1
    desc: Question answering model
    custom: ~

    inputs:

      - id: context
        type: String
        desc: Context
        example: "The large green ball bounced down the twisty road"

      - id: question
        type: String
        desc: Question
        example: "What colour is the ball?"

    outputs:

      - id: answer
        type: String
        desc: Answer in the text

      - id: score
        type: Float
        desc: Probability score

Now let's explain the main lines:

  • Under application:

module tells us where to find the interface that we have implemented:

module: my_app.py

Since we are using IPUs, let's check the resources used. The distilbert-base IPU config indicates 2 IPUs. With this config line, SSF will verify that the system can acquire 2 IPUs when running the run or test command.

ipus: 2

Application dependencies must be declared. This includes the required Python packages, plus the Poplar SDK and wheels if IPUs will be used. Our model needs optimum-graphcore with some specific supporting packages, plus Poplar 3.3.0 with PopTorch:

dependencies:
    python: --find-links https://download.pytorch.org/whl/cpu/torch_stable.html torch==2.0.1+cpu, optimum-graphcore==0.7.1, tokenizers==0.11.1, numpy==1.23.5
    poplar: ["3.3.0"]
    poplar_wheels: poptorch

The package section refers to the gc-ssf package command; here we can control how SSF builds our container. We can include any files used by our application (and exclude others); glob patterns are supported. Let's include everything generated in the compilation cache:

    inclusions: [exe_cache/*]
    exclusions: []

Finally, we want to use Graphcore's base image with PyTorch pre-installed, so we can run optimum-graphcore without issues:

docker:
        baseimage: "graphcore/pytorch:latest"
  • Under endpoints:

This is how SSF will generate our API.

id: QA
    version: 1

The endpoint path will be v1/QA. Now let's recall our application's request(self, params: dict, meta: dict) method. Here we describe the input dictionary using the names of its keys (context, question) and SSF types.

    inputs:
      - id: context
        type: String
        desc: Context
        example: "The large green ball bounced down the twisty road"

      - id: question
        type: String
        desc: Question
        example: "What colour is the ball?"

We also want to describe the outputs. Notice we are only selecting answer and score from our results as we are not interested in returning the start and end keys.

    outputs:

      - id: answer
        type: String
        desc: Answer in the text

      - id: score
        type: Float
        desc: Probability score
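
If we prefer to make this selection explicit in code as well, the request method could filter the pipeline result so that the returned keys match the declared outputs exactly (an optional sketch; the config-level selection above already covers this):

    def request(self, params: dict, meta: dict) -> dict:
        result = self.question_answerer(params)
        # Return only the keys declared as outputs in ssf_config.yaml.
        return {"answer": result["answer"], "score": result["score"]}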

Our application is officially ready! ✨
Now let's see what SSF can do.

Use SSF

We should now have the following file structure:

project_directory/
    - ssf_config.yaml
    - my_app.py

Running the application locally for development and testing

We can use the SSF commands to run our application:

gc-ssf init build run --config ssf_config.yaml

The output should look similar to this:

demo@dev2:~/workspace/ssf$ gc-ssf init build run --config examples/walkthrough/ssf_config.yaml
2023-10-19 12:55:14,483 420941     INFO      > Config /nethome/demmo/workspace/ssf/examples/walkthrough/ssf_config.yaml (cli.py:639)
2023-10-19 12:55:14,490 420941     INFO      application.license_name not specified. Defaulting to 'None' (load_config.py:375)
...
2023-10-19 12:56:22,004 420941     INFO      > Lifespan start (server.py:82)
2023-10-19 12:56:22,004 420941     INFO      Lifespan start : start application (threaded) (server.py:83)
2023-10-19 12:56:22,005 420941     INFO      Application startup complete. (on.py:62)
2023-10-19 12:56:22,006 420941     INFO      Uvicorn running on http://0.0.0.0:8100 (Press CTRL+C to quit) (server.py:219)
...
2023-10-19 12:56:44,676 420941     INFO      Dispatcher ready (dispatcher.py:542)
2023-10-19 12:56:49,493 421639     INFO      [0] Dispatcher polling application replica watchdog (dispatcher.py:242)

We can see the address and port on which the application endpoints have been started: in this case 0.0.0.0 (all interfaces) and port 8100. We can also see when the dispatcher object, through which calls (requests) to the application are made, becomes ready. While the endpoint is idle, we will see repeated polls of the application watchdog; this is the SSF built-in watchdog feature. See watchdog for further details.

Use a browser to open the endpoint docs with the format http://<address>/docs, for example http://0.0.0.0:8100/docs.
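
We can also exercise the endpoint programmatically. A minimal sketch using the requests library, assuming the generated endpoint accepts the inputs as a JSON body (the generated /docs page shows the exact request format):

import requests

payload = {
    "question": "What colour is the ball?",
    "context": "The large green ball bounced down the twisty road",
}
response = requests.post("http://0.0.0.0:8100/v1/QA", json=payload)
response.raise_for_status()
print(response.json())  # expected keys: "answer" and "score"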

Use CTRL-C to stop the running application.

Tip:
If you are using Visual Studio Code with a remote connection to your development host then you can use the Port forwarding feature to add the served port (8100 in this example) and browse the endpoint directly in your VSC client using the built-in simple browser.

Packaging and Deployment

Once we are satisfied the application is working correctly, we can package and deploy it.

We can use the SSF commands for several different scenarios. First we should decide which commands we want to run locally (on our current machine) and which commands will run on the remote (deployment) machine.

Let's look at a couple of examples.

Example 1 - deployment using an application-specific packaged image

In this first example we build our server locally and deploy its image via Docker Hub. The workflow can be summarised as follows:

(local)-> init, build, package, publish, deploy
(remote)-> run
  • The init and build steps are run locally, to compile the model before packaging.
  • The package step creates the container image locally, and publish pushes it to Docker Hub.
  • Finally, deploy sends and executes the deployment script on our deployment target. In the previous step, the container image was packaged in such a way that it executes run when started on the remote machine.

Let's build our container. We use --package-tag to replace the tag from the config with our Docker username, since we will push the image to Docker Hub.

gc-ssf init build package --config ssf_config.yaml --package-tag <docker-username>/<repo-name>

The output should look similar to this:

demo@dev2:~/workspace/ssf$ gc-ssf init build package --config examples/walkthrough/ssf_config.yaml --package-tag graphcore/cloudsolutions-dev:walkthrough_api
2023-10-19 10:11:17,670 380504     INFO      > Config /nethome/demo/workspace/ssf/examples/walkthrough/ssf_config.yaml (cli.py:639)
2023-10-19 10:11:17,675 380504     INFO      application.license_name not specified. Defaulting to 'None' (load_config.py:375)
2023-10-19 10:11:17,675 380504     INFO      application.license_url not specified. Defaulting to 'None' (load_config.py:375)
2023-10-19 10:11:17,675 380504     INFO      application.terms_of_service not specified. Defaulting to 'None' (load_config.py:375)
2023-10-19 10:11:17,675 380504     INFO      application.startup_timeout not specified. Defaulting to '300' (load_config.py:375)
2023-10-19 10:11:17,675 380504     INFO      application.package.name not specified. Defaulting to 'qa_api.1.0.tar.gz' (load_config.py:375)
2023-10-19 10:11:17,675 380504     INFO      application.package.tag not specified. Defaulting to 'qa_api:1.0' (load_config.py:375)
2023-10-19 10:11:17,675 380504     INFO      application.package.docker.run not specified. Defaulting to '' (load_config.py:375)
2023-10-19 10:11:17,675 380504     INFO      application.max_batch_size not specified. Defaulting to '1' (load_config.py:375)
2023-10-19 10:11:17,675 380504     INFO      application.syspaths not specified. Defaulting to '[]' (load_config.py:375)
2023-10-19 10:11:17,676 380504     INFO      endpoints.0.http_param_format not specified. Defaulting to 'None' (load_config.py:375)
2023-10-19 10:11:17,676 380504     INFO      endpoints.0.outputs.0.example not specified. Defaulting to 'None' (load_config.py:375)
2023-10-19 10:11:17,676 380504     INFO      endpoints.0.outputs.1.example not specified. Defaulting to 'None' (load_config.py:375)
2023-10-19 10:11:17,676 380504     INFO      Adding syspath /nethome/demo/workspace/ssf/examples/walkthrough (cli.py:683)
2023-10-19 10:11:17,676 380504     INFO      > ==== Init ==== (init.py:17)
2023-10-19 10:11:17,676 380504     INFO      > Cleaning endpoints (init.py:19)
2023-10-19 10:11:17,677 380504     INFO      > Cleaning application (init.py:22)
2023-10-19 10:11:17,678 380504     INFO      Clean /nethome/demo/workspace/ssf/examples/walkthrough/exe_cache/8218824126841776145.popef (init.py:36)
2023-10-19 10:11:17,679 380504     INFO      > ==== Build ==== (build.py:19)
2023-10-19 10:11:17,679 380504     INFO      > Generate_endpoints (build.py:26)
2023-10-19 10:11:17,679 380504     INFO      loading module /nethome/demo/workspace/ssf/ssf/generate_endpoints_fastapi.py with module name generate_endpoints (utils.py:298)
2023-10-19 10:11:17,684 380504     INFO      > Load application (build.py:29)
2023-10-19 10:11:17,684 380504     INFO      Creating application main interface (application.py:160)
2023-10-19 10:11:17,684 380504     INFO      Checking application dependencies (application.py:161)
2023-10-19 10:11:17,684 380504     INFO      installing python packages git+https://github.com/huggingface/optimum-graphcore.git@97c11c3 (utils.py:276)
2023-10-19 10:11:26,230 380504     INFO      Loading qa_api application main interface from /nethome/demo/workspace/ssf/examples/walkthrough/my_app.py with module id qa_api (application.py:168)
2023-10-19 10:11:26,230 380504     INFO      loading module /nethome/demo/workspace/ssf/examples/walkthrough/my_app.py with module name qa_api (utils.py:298)
2023-10-19 10:11:27,765 380504     INFO      Created a temporary directory at /tmp/tmpu3ter_oa (instantiator.py:21)
2023-10-19 10:11:27,766 380504     INFO      Writing /tmp/tmpu3ter_oa/_remote_module_non_scriptable.py (instantiator.py:76)
2023-10-19 10:11:28,553 380504     INFO      <module 'qa_api' from '/nethome/demo/workspace/ssf/examples/walkthrough/my_app.py'> (application.py:172)
2023-10-19 10:11:28,553 380504     INFO      Found <class 'qa_api.MyApplication'>, MyApplication (application.py:225)
2023-10-19 10:11:28,553 380504     INFO      instance=<qa_api.MyApplication object at 0x7f6c3ba5d3a0> (build.py:32)
2023-10-19 10:11:28,553 380504     INFO      > Build application (build.py:34)
2023-10-19 10:11:28,626 380504     INFO      Compiling model... (my_app.py:26)
No padding arguments specified, so padding to 384 by default. Inputs longer than 384 will be truncated.
Graph compilation: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 100/100 [00:36<00:00]
2023-10-19 10:12:15,724 380504     INFO      > ==== Package ==== (package.py:51)
2023-10-19 10:12:15,724 380504     INFO      > Packaging qa_api to /nethome/demo/workspace/ssf/.package/qa_api (package.py:71)
2023-10-19 10:12:15,725 380504     INFO      > Package name qa_api.1.0.tar.gz (package.py:72)
2023-10-19 10:12:15,725 380504     INFO      > Package tag graphcore/cloudsolutions-dev:walkthrough_api (package.py:73)
2023-10-19 10:12:15,806 380504     INFO      > Package SSF from /nethome/demo/workspace/ssf/ssf (package.py:111)
2023-10-19 10:12:15,862 380504     INFO      > Package Application from /nethome/demo/workspace/ssf/examples/walkthrough (package.py:169)
2023-10-19 10:12:15,960 380504     INFO      > Package Endpoint files (package.py:185)
2023-10-19 10:12:15,962 380504     INFO      > Gathering pip requirements (package.py:206)
2023-10-19 10:12:15,963 380504     INFO      > Generate container image (package.py:245)
2023-10-19 10:12:15,963 380504     INFO      application.package.docker_run not specified. Defaulting to '' (utils.py:67)
2023-10-19 10:14:31,414 380504     INFO      > Package: (package.py:280)
2023-10-19 10:14:31,414 380504     INFO      > qa_api.1.0.tar.gz (from /nethome/demo/workspace/ssf/.package/qa_api/src) (package.py:281)
2023-10-19 10:14:31,414 380504     INFO      > Test run: 'cd /nethome/demo/workspace/ssf/.package/qa_api/src && ./run.sh' (package.py:282)
2023-10-19 10:14:31,414 380504     INFO      > Docker: (package.py:284)
2023-10-19 10:14:31,414 380504     INFO      > Run: 'docker run --rm -d --network host --name qa_api graphcore/cloudsolutions-dev:walkthrough_api' (package.py:285)
2023-10-19 10:14:31,414 380504     INFO      > Run with IPU devices: 'gc-docker -- --rm -d  --name qa_api graphcore/cloudsolutions-dev:walkthrough_api' (package.py:288)
2023-10-19 10:14:31,414 380504     INFO      > Logs: 'docker logs -f qa_api' (package.py:292)
2023-10-19 10:14:31,414 380504     INFO      > Stop: 'docker stop qa_api' (package.py:293)
2023-10-19 10:14:31,414 380504     INFO      Exit with 0 (cli.py:739)

We should be able to see that a new .package/ directory has been created; it contains the packaged application, named qa_api. This is the packaged source used to build the Docker image. We can test or debug the packaged application source locally by moving to the .package/qa_api/src directory and running ./run.sh. This is the same entry point that will be used when the application Docker image is deployed remotely.

We can also verify that our application image was created during the package step with docker, for example:

demo@dev2:~/workspace/ssf$ docker images
REPOSITORY                     TAG                           IMAGE ID       CREATED          SIZE
graphcore/cloudsolutions-dev   walkthrough_api               9e301f5fb62f   7 minutes ago    3.57GB

For the next step, a Docker Hub login is necessary. SSF will log in to Docker temporarily when the --docker-username and --docker-password options are used. If you are already logged in with the correct account, you don't need these options for this step.

We can now publish our image to Docker Hub.

Let's specify that we will push it with the same tag as during the package step, using --package-tag:

gc-ssf publish --config ssf_config.yaml --package-tag <docker-username>/<repo-name> --docker-username <docker-username> --docker-password <token>

Finally, we can deploy the container to Gcore.

We will assume that we have already set up a VM with at least 2 IPUs, with the IP address 123.456.789.0 and the default username "ubuntu". To access it, we have a private key, which SSF will need in order to access the VM and deploy. To pass the key securely, we will store it in an environment variable and use the --add-ssh-key option. For instance, we can set it from a file:

SSH_KEY=$(cat ssh_key_file)  gc-ssf --add-ssh-key SSH_KEY

Now let's run the deploy command with the following set of options. We also pass our Docker token via the --docker-password option, which is needed to pull the image from our Docker Hub repo to the remote VM (not needed if you use a public repo).

gc-ssf deploy --config ssf_config.yaml --deploy-platform Gcore --port 8100 --deploy-gcore-target-address 123.456.789.0 --deploy-gcore-target-username ubuntu --docker-username <docker-username> --docker-password <token> --package-tag <username>/<repo-name>:latest --deploy-package
  • Notice the use of --deploy-package to specify that we want to deploy the application-specific package that we published previously.

  • You can combine the --add-ssh-key argument with other SSF commands, so we might choose to include it with gc-ssf deploy and use a single invocation of SSF to configure keys and deploy the packaged application image.

With this configuration our API endpoint should be available at http://123.456.789.0:8100/v1/QA.

Since it's using FastAPI you can also test it with Swagger UI under the path http://123.456.789.0:8100/docs.

Under the hood, the deploy command will simply run a script on the Gcore VM to pull our custom image from Docker Hub and run it.

It is valid to run gc-ssf ... init build package publish on a machine that supports the target (for example, one that has the required IPUs and Poplar SDK for the build) and then later deploy the published application from any client with gc-ssf ... deploy, in which case the client from which deploy is issued doesn't strictly need IPUs or the Poplar SDK.

The following diagram summarises the operations of this first example.

Example 2 - deployment using the generic SSF image

Sometimes our local environment doesn't allow us to build containers, or we just want to experiment quickly. In this second example we won't build an application-specific container image. This means that the following workflow is possible:

(local)-> deploy
(remote)-> init, build, run

This is made possible by storing our model in a repository and using a pre-built SSF image. First we need to set up a remote repository for our model. For example, using a GitHub account we could do:

  cd project_directory && git init
  git add -A
  git commit -m 'First commit'
  git remote add origin git@github.com:your-username/project_directory.git
  git push -u -f origin main

To register the VM SSH key locally we could do:

SSH_KEY=$(cat ssh_key_file)  gc-ssf --add-ssh-key SSH_KEY

If you use a private repo, you will also need to allow your VM to clone from it. To do that you will need to generate a GitHub deploy-key for your repo (or an equivalent access-limited SSH key). Then, pass it with the deploy command using an env variable (for example MY_DEPLOY_KEY) and --add-ssh-key.

Now let's use deploy targeting our git repo:

MY_DEPLOY_KEY=$(cat github_deploy_key) gc-ssf deploy --config 'git@github.com:your-username/project_directory.git|ssf_config.yaml' --port 8100  --deploy-platform Gcore --deploy-gcore-target-address 123.456.789.0 --deploy-gcore-target-username ubuntu --add-ssh-key MY_DEPLOY_KEY
  • Notice this time we are not using --deploy-package, so SSF will deploy the default public generic SSF image. gc-ssf --help can be used to see the default SSF image used for deployment.

Under the hood, the deploy command sends and runs a script on the Gcore VM that pulls the public SSF image from Docker Hub and runs it. The container entry point then clones our repo and issues the three commands init, build and run.

As in the first example, our API endpoint should be available at
http://123.456.789.0:8100/v1/QA.
You can also test it with Swagger UI under the path http://123.456.789.0:8100/docs.

The following diagram summarises the operations of this second example.

Note that we only deployed a Docker container on a Gcore VM.
You can still SSH into your VM as normal and use the usual Docker commands, for example docker container ls, docker logs ..., docker stop ....

NOTE: This feature defaults to using the generic Graphcore published SSF image corresponding to your local version of SSF (gc-ssf). If this is not available for your current version of SSF then you can still use the feature by creating your own generic SSF image:

  • Build the generic SSF image for your local version with gc-ssf package --package-tag ssf (this will package SSF without binding an application)
  • Publish the resulting SSF image to your own Docker repository
  • When deploying, add --package-tag <published ssf image> to deploy your published SSF image instead of the default SSF image

See building an SSF image

Discussion: Example 1 vs Example 2

These examples have shown two different ways to deploy on the Gcore platform with SSF. Both serve your application with the same API, but it's important to understand their differences.

Example 1 - deployment using an application-specific packaged image

This gives you more control:

By packaging your app in advance with SSF, you create your own custom Docker image, which you can then version via Docker Hub. This method can also have runtime advantages: by building and packaging some runtime-generated files in advance (such as pre-compiled IPU executables) you can save precious server startup time.

Example 2 - deployment using the generic SSF image

This is quicker but can have some runtime impact:

By deploying your model with the generic SSF image, you don't need to run Docker locally or worry about the packaging step, and your model can be versioned via git. The server startup time might be affected, since the application build step will be triggered in the deployment environment before the server starts. Of course, you can still include cached files as part of the model repository, but depending on their size you might prefer to package your app in advance and follow Example 1, for instance if you have a very large model to compile.