
Learning Outcomes and Material

This exercise discusses two sides of making an API available: documenting it for other developers, and deploying it to be accessed via the internet.
The first part of this exercise introduces OpenAPI for writing API documentation and takes a quick glance at the tools related to it. You will learn the basic structure of an OpenAPI document, and how to offer API documentation directly from Flask. Documentation will be made for the same version of the SensorHub API that was used in the previous testing material.
sensorhub.py

Introduction Lecture

The introduction lecture will only be held in person on campus. As the contents of this exercise have changed, recordings from previous years are no longer applicable.

API Documentation with OpenAPI

Your API is only as good as its documentation. It hardly matters how neat and useful your API is if no one knows how to use it. This is true whether it is a public API or one used between different services in a closed architecture. Good API documentation shows all requests that are possible; what parameters, headers, and data are needed to make those requests; and what kinds of responses can be expected. Examples and semantics should be provided for everything.
API documentation is generally done with description languages that are supported by tools for generating the documentation. In this exercise we will be looking into OpenAPI and Swagger - an API description format and the toolset around it. These tools allow creating and maintaining API documentation in a structured way. Furthermore, various tools can be used to generate parts of the documentation automatically, and to reuse schemas from the documentation in the API implementation itself. All of these tools make it easier to maintain documentation as the distance between your code and your documentation becomes smaller.
While there are a lot of fancy tools to generate documentation automatically, you first need a proper understanding of the API description format. Without understanding the format it is hard to evaluate when and how to use fancier tools. This material will focus on just that: understanding the OpenAPI specification and being able to write documentation with it.
This material uses OpenAPI version 3.0.4 because at the time of writing, Flasgger does not support the latest 3.1.x versions.

Preparation

There are a couple of pages that are useful to have open in your browser for this material. First there is the obvious OpenAPI specification. It is quite a hefty document and a little hard to get into at first; nevertheless, after going through this material you should have a basic understanding of how to read it. The second page to keep handy is the Swagger editor, where you can paste various examples to see how they are rendered. It is also very useful when documenting your own project, to ensure your documentation conforms to the OpenAPI specification.
On the Python side, there are a couple of modules that are needed. Primarily we want Flasgger, which is a Swagger toolkit for Flask. We also have some use for PyYAML. Perform these sorceries and you're all set:
pip install flasgger
pip install pyyaml

Very Short Introduction to YAML

At its core OpenAPI is a specification format that can be written in JSON or YAML. We are going to use YAML in the examples for two reasons: it's the format supported by Flasgger, and even more importantly it is much less "noisy", which makes it a whole lot more pleasant to edit. YAML is "a human-friendly data serialization language for all programming languages". It is quite similar to JSON but, much like Python, it removes the syntactic noise of curly braces by separating blocks with indentation. It also strips the need for quotation marks around strings, and most importantly does not give two hoots about extra commas after the last item of an object or array. To give a short example, here is a comparison of the same sensor serialized first in JSON:
{
    "name": "test-sensor-1",
    "model": "uo-test-sensor",
    "location": {
        "name": "test-site-a",
        "description": "some random university hallway"
    }
}
And the same in YAML:
name: test-sensor-1
model: uo-test-sensor
location:
  name: test-site-a
  description: some random university hallway
The only other thing you really need to know is that items that belong to the same array are prefixed with a dash (-) instead of a key. A quick example of a list of sensors serialized in JSON:
{
    "items": [
        {
            "name": "test-sensor-1",
            "model": "uo-test-sensor",
            "location": "test-site-a"
        },
        {
            "name": "test-sensor-2",
            "model": "uo-test-sensor",
            "location": null
        }
    ]
}
And again the same in YAML:
items:
- name: test-sensor-1
  model: uo-test-sensor
  location: test-site-a
- name: test-sensor-2
  model: uo-test-sensor
  location: null
Note the lack of differentiation between string values and the null value. Here null is simply a reserved keyword that is converted automatically during parsing. Numbers work similarly. If you absolutely need the string "null" instead, you can add quotes, writing location: 'null'. Finally, there are two ways to write longer pieces of text: literal and folded style. Examples below:
multiline: |
  This is a very long description
  that spans a whole two lines
folded: >
  This is another long description
  that will be in a single line
There are a few more details to YAML, but they will not be relevant for this exercise. Feel free to look them up in the specification.
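If you want to see for yourself how these constructs parse, you can feed a snippet to PyYAML (installed above) in a Python console. This is just a quick sketch, not something you need for the exercise:
import yaml

doc = """
location: null
quoted: 'null'
multiline: |
  This is a very long description
  that spans a whole two lines
folded: >
  This is another long description
  that will be in a single line
"""
data = yaml.safe_load(doc)
print(data["location"])          # None
print(data["quoted"])            # null (as a string)
print(repr(data["multiline"]))   # two lines separated by a newline
print(repr(data["folded"]))      # folded into a single line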

OpenAPI Structure

An OpenAPI document is a rather massively nested structure. In order to get a better grasp of the structure we will start from the top, the OpenAPI Object. This is the document's root level object, and it contains a total of 8 possible fields (openapi, info, servers, paths, components, security, tags, and externalDocs), out of which 3 are required: openapi, info, and paths.
The next sections will dive into info, paths, and components in more detail. Presented below is the absolute minimum of what must be in an OpenAPI document. It is absolutely useless as a document, but it should give you an idea of the very basics.
openapi: 3.0.4
info:
  title: Absolute Minimal Document
  version: 1.0.0
paths:
  /:
    get:
      responses:
        '200':
          description: An empty root page

Info Object

The info object contains some basic information about your API. This information will be displayed at the top of the documentation. It should give the reader relevant basic information about what the API is for, and - especially for public APIs - terms of service and license information. The fields are quite self-descriptive in the OpenAPI specification: title and version are required, while description, termsOfService, contact, and license are optional.
Below is an example of a completely filled info object.
info:
  title: Sensorhub Example
  version: 0.0.1
  description: |
    This is an API example used in the Programmable Web Project course.
    It stores data about sensors and where they have been deployed.
  termsOfService: http://totally.not.placehold.er/
  contact:
    url: http://totally.not.placehold.er/
    email: pwp-course@lists.oulu.fi
    name: PWP Staff List
  license:
    name: Apache 2.0
    url: https://www.apache.org/licenses/LICENSE-2.0.html

Components Object

The components object is a very handy feature in the OpenAPI specification that can drastically reduce the amount of work needed when maintaining documentation. It is essentially a storage for reusable objects that can be referenced from other parts of the documentation (the paths object in particular). Anything that appears more than once in the documentation should be placed here. That way you do not need to update multiple copies of the same thing when making changes to the API.
This object has various fields that categorise the components by their object type: schemas, responses, parameters, examples, requestBodies, headers, securitySchemes, links, and callbacks. Out of these we are going to dive into the details of schemas, parameters, and security schemes in their own subsections next.

Schema Object

The schemas field in components will be the new home for all of our schemas. The structure is rather simple: it's just a mapping of schema name to schema object. Schema objects are essentially JSON schemas, just written out in YAML (in our case, that is - OpenAPI can be written in JSON too). OpenAPI does adjust the definitions of some properties of JSON Schema, as specified in the schema object documentation.
Below is a simple example of how to write the sensor schema we used earlier into a reusable schema component in OpenAPI.
components:
  schemas:
    Sensor:
      type: object
      properties:
        model:
          description: Name of the sensor's model
          type: string
        name:
          description: Sensor's unique name
          type: string
      required:
      - name
      - model
Since we already wrote these schemas once as part of our model classes, there's little point in writing them manually again. With a couple of lines in the Python console you can output the results of your json_schema methods as YAML, which you can then copy-paste into your document:
import yaml
from sensorhub import Sensor
print(yaml.dump(Sensor.json_schema()))
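If you want the output to be closer to its final form, you can wrap the schema under a schemas-style mapping before dumping it (still just a convenience sketch):
import yaml
from sensorhub import Sensor

# Nest the generated schema under a "schemas" key so the output can be
# pasted under "components" with minimal manual editing.
print(yaml.dump({"schemas": {"Sensor": Sensor.json_schema()}}))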

Parameter Object

Describing all of the URL variables in our routes under the parameters field in components is usually a good idea. Even in a small API like the course project, at least the root level variables will be present in a lot of URIs. For instance the sensor variable is already present in at least three routes:
/api/sensors/<sensor>/
/api/sensors/<sensor>/measurements/
/api/sensors/<sensor>/measurements/1/
For the sake of defining each thing in only one place, it seems very natural for parameters to be reusable components. Also, although we didn't talk about query parameters much, they can be described here too - useful if you have lots of resources that support filtering or sorting with similar queries. In OpenAPI a parameter is described through a few fields: name and in (the parameter's location, e.g. path or query) are required, and description, required, and schema are typically given as well - for path parameters, required must be set to true.
Below is an example of the sensor path parameter:
components:
  parameters:
    sensor:
      description: Selected sensor's unique name
      in: path
      name: sensor
      required: true
      schema:
        type: string
As you can see it's quite a few lines just to describe one parameter. All the more reason to define it in one place only. If you look at the parameter specification it also lists quite a few ways to describe parameter style besides schema, but for our purposes schema will be sufficient.

Security Scheme Component

If your API uses authentication, it is quite likely that it is used for more than one resource. Therefore placing security schemes in the reusable components part seems like a smart thing to do. What exactly a security scheme should contain depends on its type. For API keys there are four fields to fill: type, name, in, and an optional description.
A quick example:
components:
  securitySchemes:
    sensorhubKey:
      type: apiKey
      name: Sensorhub-Api-Key
      in: header

Paths Object

The paths object is the meat of your OpenAPI documentation. This object needs to document every single path (route) in your API. All available methods also need to be documented with enough detail that a client can be implemented based on the documentation. This sounds like a lot of work, and it is, but it's also necessary. Luckily there are ways to reduce the work, but first let's take a look at how to do it completely manually to get an understanding of what actually goes into these descriptions.
By itself the paths object is just a mapping of path (route) to the path object that describes it. So the keys in this object are just your paths, including any path parameters. Unlike Flask, where these are marked with angle brackets (e.g. <sensor>), in OpenAPI they are marked with curly braces (e.g. {sensor}). So, for instance, the very start of our paths object would be something like:
paths:
  /sensors/:
    ...
  /sensors/{sensor}/:
    ...
Note that these paths are appended to whatever you put in the root level servers field. Since we put /api there, these paths in full would be the same as our routes: /api/sensors/ and /api/sensors/{sensor}/.

Path Object

A single path is mostly a container for other objects, particularly parameters and operations. As we discussed earlier, pulling the parameters from the components object is a good way to avoid typing the same documentation twice. The operations refer to each of the HTTP methods that are supported by this resource. Before moving on to operations, here is a quick example of referencing the sensor parameter we placed in components:
paths:
  /sensors/{sensor}/:
    parameters:
    - $ref: '#/components/parameters/sensor'
In short, a reference is made with the $ref key, using the referenced object's address in the documentation as the value. When this is rendered, the contents of the referenced parameter are shown in the documentation.

Operation Object

An operation object contains all the details of a single operation done to a resource. These match the HTTP methods that are available for the resource. Operations can be roughly divided into two types: ones that return a response body (mostly GET) and ones that don't. Once again the OpenAPI documentation for operations lists quite a few fields; the ones we will actually use are description, parameters, requestBody, and responses (the only required one).
The responses part is a mapping from status code to the description of that response. An important thing to note is that the codes need to be quoted - otherwise YAML parses them as numbers, while the OpenAPI specification expects the keys of the responses mapping to be strings. The contents of a response are discussed next.
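A quick way to see why the quotes matter is to parse both variants with PyYAML (a throwaway sketch, not part of the documentation itself):
import yaml

# Without quotes the status code becomes an integer key, which an OpenAPI
# validator will reject; with quotes it stays a string.
unquoted = yaml.safe_load("responses:\n  200:\n    description: OK")
quoted = yaml.safe_load("responses:\n  '200':\n    description: OK")
print(list(unquoted["responses"]))  # [200]
print(list(quoted["responses"]))    # ['200']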

Response Object

A response object describes what kind of data is to be expected from the API. This also includes all the error responses that can be received when the client makes an invalid request. At the very minimum a response object needs to provide a description, and for error responses this might be sufficient. However, for 200 responses the documentation generally also needs to provide at least one example of a response body. This goes into the content field, which is itself a mapping of media type to media type object.
The media type object defines the contents of the response through a schema and/or example(s). This time we will show how to do that with examples. In our SensorHub API we can have two kinds of sensors returned from the sensor resource: sensors with a location, and sensors without one. For completeness' sake it would be best to show an example of both, in which case using the examples field is a good idea. The examples field is a mapping of example name to an example object that usually contains a description, and finally a value field where the example itself is placed. Here is a full example all the way from the document root to the examples in the sensor resource. It shows two responses (200 and 404), and two different examples (deployed-sensor and stored-sensor).
paths:
  /sensors/{sensor}/:
    parameters:
    - $ref: '#/components/parameters/sensor'
    get:
      description: Get details of one sensor
      responses:
        '200':
          description: Data of single sensor with extended location info
          content:
            application/json:
              examples:
                deployed-sensor:
                  description: A sensor that has been placed into a location
                  value:
                    name: test-sensor-1
                    model: uo-test-sensor
                    location:
                      name: test-site-a
                      latitude: 123.45
                      longitude: 123.45
                      altitude: 44.51
                      description: in some random university hallway
                stored-sensor:
                  description: A sensor that lies in the storage, currently unused
                  value:
                    name: test-sensor-2
                    model: uo-test-sensor
                    location: null
        '404':
          description: The sensor was not found
Here is another example that uses a single example through the example field. In this case the example content is simply dumped as the field's value. This time the response body is an array, as denoted by the dashes.
paths:
  /sensors/:
    get:
      description: Get the list of managed sensors
      responses:
        '200':
          description: List of sensors with shortened location info
          content:
            application/json:
              example:
              - name: test-sensor-1
                model: uo-test-sensor
                location: test-site-a
              - name: test-sensor-2
                model: uo-test-sensor
                location: null
One final example shows how to include the Location header when documenting 201 responses. This time a headers field is added to the response object, while content is omitted (because the 201 response in this API does not have a body).
paths:
  /sensors/:
    post:
      description: Create a new sensor
      responses:
        '201':
          description: The sensor was created successfully
          headers:
            Location: 
              description: URI of the new sensor
              schema: 
                type: string
Here the key in the headers mapping must be identical to the actual header name in the response.
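For reference, here is a minimal, self-contained sketch (not the course project's actual code) of how the implementation side can produce that Location header with Flask-RESTful:
from flask import Flask, Response
from flask_restful import Api, Resource

app = Flask(__name__)
api = Api(app)

class SensorItem(Resource):
    def get(self, sensor):
        return {"name": sensor}

class SensorCollection(Resource):
    def post(self):
        # ... request validation and database insert omitted in this sketch ...
        new_name = "new-test-sensor-1"
        # The header name here must match the key used under "headers"
        # in the documentation above.
        return Response(
            status=201,
            headers={"Location": api.url_for(SensorItem, sensor=new_name)}
        )

api.add_resource(SensorCollection, "/api/sensors/")
api.add_resource(SensorItem, "/api/sensors/<sensor>/")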

Request Body Object

In POST, PUT, and PATCH operations it's usually helpful to provide an example or a schema of what is expected in the request body. Much like a response object, a request body object is also made of the description and content fields. As stated earlier, it might be better to put these into components from the start, but here we're showing them embedded in the paths themselves. As such there isn't much new to show here, as the content field should contain a similar media type object as the respective field in responses. Our example shows the POST method for the sensors collection, with both a schema (referenced) and an example:
paths:
  /sensors/:
    post:
      description: Create a new sensor
      requestBody:
        description: JSON document that contains basic data for a new sensor
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/Sensor'
            example:
              name: new-test-sensor-1
              model: uo-test-sensor-plus

Full Example

You can download the full SensorHub API example below. Feed it to the Swagger editor to see how it renders. In the next section we'll go through how to have it rendered directly from the API server.
sensorhub.yml

Inventory Documenter

The inventory API may be a little bit small at the moment, but that is no excuse to skip documenting it. In this task you'll write some basic OpenAPI documentation for one resource.
Learning goals: Getting familiar with OpenAPI structure and writing documentation using it.

Document Description:
For this task you need to return a valid OpenAPI description that documents the product collection resource. This means you need to fill in all mandatory fields in the document. Of course the documentation must also be in line with the implementation itself. The implementation is the same one you were left with after the Resource Locator task. You are free to write whatever you like in any description fields. More elaborate requirements are below.
Components Section: Your components object has to include a schema for products, using the key Product. This schema needs to match the product model class (without in_storage, which is a relationship).
Paths Section: The paths section has to document the /products/ path. This path supports two HTTP methods, both of which need to be documented. For the GET method, your documentation needs to include an example that is an array with at least one valid product (one that passes your schema). For the POST method you need to provide a schema via reference, and a valid example for the request body. Do not forget to include all the possible response codes (including the errors). If the response expects a header, it should also be there.

Use the Swagger editor to make sure your OpenAPI document is valid and otherwise looks correct before returning it. The checker will validate your document against the OpenAPI schema, and also check that it conforms with the above requirements.

Swagger with Flasgger

Flasgger is a toolkit that brings Swagger to Flask. At the very minimum it can be used for serving the API documentation from the server, with the same rendering that is used in the Swagger editor. It can also do other fancy things, some of which we'll look into, and some will be left to the reader's curiosity.

Basic Setup

When setting up documentation the source YAML files should be put into their own folder. As the first step, let's create a folder called doc, under the folder that contains your app (or the api.py file if you are using a proper project structure). Download the example from above and place it into the doc folder.
In order to enable Flasgger, it needs to be imported, configured, and initialized. Very much like Flask-SQLAlchemy and Flask-Caching earlier. This whole process is shown in the code snippet below.
from flasgger import Swagger, swag_from

app = Flask(__name__, static_folder="static")
# ... SQLAlchemy and Caching setup omitted from here
app.config["SWAGGER"] = {
    "title": "Sensorhub API",
    "openapi": "3.0.4",
    "uiversion": 3,
}
swagger = Swagger(app, template_file="doc/sensorhub.yml")
This is actually everything you need to do to make the documentation viewable. Just point your browser to http://localhost:5000/apidocs/ after starting your Flask test server, and you should see the docs.
NOTE: Flasgger requires all YAML documents to use the start of document marker, three dashes ---.
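If you want a quick automated check that the documentation is actually being served, a small test along these lines should work. It is a sketch that assumes the Flask app configured above is importable as app from a module called sensorhub - adjust the import to your project layout.
from sensorhub import app

def test_docs_are_served():
    # /apidocs/ is the same Swagger UI route mentioned above.
    client = app.test_client()
    response = client.get("/apidocs/")
    assert response.status_code == 200

if __name__ == "__main__":
    test_docs_are_served()
    print("Swagger UI responds at /apidocs/")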

Modular Swaggering

If holding all of your documentation in a ginormous YAML file sounds like a maintenance nightmare to you, you are probably not alone. Even by itself OpenAPI supports splitting the description into multiple files using file references. If you paid attention you may have noticed that the YAML file was passed to the Swagger constructor as template_file. This indicates it is intended to be just the base, not the whole documentation.
Flasgger allows us to document each view (resource method) either in a separate file or in the method's docstring. First let's look at using docstrings. In order to document from there, you simply move the contents of the entire operation object inside the docstring, and precede it with three dashes. This separates the OpenAPI part from the rest of the docstring. Here is the newly documented GET method for the sensors collection:
class SensorCollection(Resource):
    
    def get(self):
        """
        This is normal docstring stuff, OpenAPI description starts after dashes.
        ---
        description: Get the list of managed sensors
        responses:
          '200':
            description: List of sensors with shortened location info
            content:
              application/json:
                example:
                - name: test-sensor-1
                  model: uo-test-sensor
                  location: test-site-a
                - name: test-sensor-2
                  model: uo-test-sensor
                  location: null
        """
    
        body = {"items": []}
        for db_sensor in Sensor.query.all():
            item = db_sensor.serialize(short_form=True)
            body["items"].append(item)
            
        return Response(json.dumps(body), 200, mimetype=JSON)
The advantage of doing this is bringing your documentation closer to your code. If you change the view method code, the corresponding API documentation is right there, and you don't need to hunt for it in some other file(s). If you remove the sensors collection path from the template file and load up the documentation, the GET method should still be documented, from this docstring. It will show up as /api/sensors/ however, because Flasgger takes the path directly from your routing.
One slight inconvenience is that you can't define parameters at the resource level anymore, and have to include them in every operation instead. In other words, this small part of the sensor resource's documentation
parameters:
- $ref: '#/components/parameters/sensor'
has to be replicated in every method's docstring. References to the components can still be used, as long as those components are defined in the template file. For instance, here is the documented PUT for the sensor resource:
class SensorItem(Resource):
    
    def put(self, sensor):
        """
        ---
        description: Replace sensor's basic data with new values
        parameters:
        - $ref: '#/components/parameters/sensor'
        requestBody:
          description: JSON document that contains new basic data for the sensor
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/Sensor'
              example:
                name: new-test-sensor-1
                model: uo-test-sensor-plus
        responses:
          '204':
            description: The sensor's attributes were updated successfully
          '400':
            description: The request body was not valid
          '404':
            description: The sensor was not found
          '409':
            description: A sensor with the same name already exists
          '415':
            description: Wrong media type was used
        """
    
        if not request.json:
            raise UnsupportedMediaType

        try:
            validate(request.json, Sensor.json_schema())
        except ValidationError as e:
            raise BadRequest(description=str(e))

        sensor.deserialize(request.json)
        try:
            db.session.add(sensor)
            db.session.commit()
        except IntegrityError:
            raise Conflict(
                "Sensor with name '{name}' already exists.".format(
                    **request.json
                )
            )
        
        return Response(status=204)
Another option is to use separate files for each view, together with the swag_from decorator. In that case you would put each operation object into its own YAML file and place it somewhere like doc/sensorcollection/get.yml. Then you'd simply decorate the methods like this:
class SensorCollection(Resource):
    
    @swag_from("doc/sensorcollection/get.yml")
    def get(self):
        ...
Or if you follow the correct naming convention for your folder structure, you can also have Flasgger do all of this for you without explicitly using the swag_from decorator. Specifically, if your file paths follow this pattern:
/{resource_class_name}/{method}.yml
then you can add "doc_dir" to Flasgger's configuration, and it will look for these documentation files automatically. Note that your filenames must have the .yml extension for autodiscovery to find them; .yaml doesn't work. One addition to the config is all you need.
app.config["SWAGGER"] = {
    "title": "Sensorhub API",
    "openapi": "3.0.4",
    "uiversion": 3,
    "doc_dir": "./doc",
}
Ultimately, how you manage your documentation files is up to you. With this section you now have three options to choose from, and with further exploration you can find more. However, it needs to be noted that Flasgger currently does not follow file references in YAML files, so you can't split your template file into smaller pieces. Still, managing all your reusable components in the template file and the view documentation elsewhere already provides some nice structure.

Deploying Flask Applications

Deployment of web applications is a major topic in today's internet environment. Large numbers of simultaneous users create pressure to build applications that can be scaled via multiple vectors. Web applications span multiple servers, and are increasingly often composed of microservices - small independent apps that each take care of one facet of the larger system. The single-process test server we've been using so far isn't exactly going to cut it anymore. In this section we will take a brief look into how you can turn your Flask app into a serviceable deployment that can handle at least a somewhat respectable number of client connections.
There are two immediate objectives for our efforts: first, we need to add parallel processing for our app; second, it needs to be managed automatically by the system. After these steps we also need to make it available.
Some amount of parallelism is always desired in web applications because they tend to spend a lot of their time on I/O operations like reading/writing to sockets, and accessing the database. Using multiple processes or threads allows the app to perform more efficiently. As a microframework Flask doesn't come with anything like this out of the box (neither does Django for that matter). Luckily there are rather straightforward solutions to this problem that are applicable to all kinds of Python web frameworks, not just Flask.
The benefit of managing something automatically should be rather obvious. A web application is not very useful if it closes when you close the terminal that's running it, e.g. by logging out of a server. Similarly, it's rather bothersome if it needs to be manually restarted when it crashes or the server gets rebooted. This goal is usually achieved in one of two ways: either by using daemon processes in Linux on more traditional server deployments, or by using container orchestration. Well, it's really just one way, because containers are also managed by daemon processes, but they are often configured via cloud application platforms like Kubernetes.
Finally, in order to make an application available, it needs to be served from a public-facing interface, usually the HTTP port 80 or the HTTPS port 443. This is typically not done directly by the application itself, as there are multiple problems involved for both performance and security. In typical deployments, applications sit comfortably behind web servers that forward client requests to them and take care of the first line of security.
Please be aware that some parts of this material can only be done on UNIX-based systems. If you don't have one, it might be a good time to learn how to roll up a virtual Linux machine using Oracle VM VirtualBox. All of the instructions are also written only for UNIX-based systems. While we assume readers have very little Linux experience, we're not going to explain every single command used. If you want to know more, look them up.
One of the tasks also requires you to use a VM in the cPouta cloud that we have set up for the course, but as these are a limited resource, please use a local VM first to understand the process, and then repeat the steps on your VM in cPouta.
We have created a VM (VirtualBox) for you to test with. It runs Lubuntu 24.04. You can download it from here, or if you are in the University network you can access it directly from \\kaappi\Virtuaalikoneet$\VMware\PWP2025\PWP.ova. The user is pwp and the password is pwp.
If you are already familiar with deploying Python applications, it should be fine to skip most of the sections.

Test Application

For all of these examples we are going to use the sensorhub app created in exercise 2. This allows us to dive a little bit deeper than most basic tutorials, which simply install a web app that says hello, with no database connections or API keys etc. to set up. Whenever you need to set up the application, use these lines. Working inside a virtual environment owned by your login user is assumed, and your current working directory should be the virtual environment's root.
Before starting our magic, be sure that you have the necessary libraries installed in your system. You can download the following requirements.txt file and run the command pip install -r requirements.txt:
requirements.txt
flask-restful
Flask-SQLAlchemy
flask-caching
jsonschema

And now you are ready to download and set up the application:
git clone https://github.com/UniOulu-Ubicomp-Programming-Courses/pwp-senshorhub-ex-2.git sensorhub
cd sensorhub
flask --app=sensorhub init-db
flask --app=sensorhub testgen
flask --app=sensorhub masterkey
Copy the master key somewhere if you want to actually be able to access anything in the API afterward. It may also be useful to know that when using virtual environments in Linux, the Flask instance folder will be /path/to/your/venv/var/sensorhub-instance by default.
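If you are unsure where the instance folder ends up on your machine, you can also check it directly from Flask. This is a quick sketch that assumes the sensorhub package set up above is importable in the active virtual environment:
from sensorhub import create_app

app = create_app()
# Prints the absolute path of the instance folder Flask actually uses.
print(app.instance_path)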

(Green) Unicorns Are Real

First we are going to introduce Gunicorn. It's a Python WSGI HTTP server that runs processes using a pre-fork model. And because that's quite the word salad, it's actually just faster to show how it works in practice first. Let's assume we have set up sensorhub following the instructions above, and we are working in an active virtual environment. From here it takes two entire lines to install Gunicorn and have it run the sensorhub app:
python -m pip install gunicorn
gunicorn -w 3 "sensorhub:create_app()"
Congrats, you are now the proud owner of 3 sensorhub processes, as determined by the optional argument -w 3 in the command above. The mandatory argument ("sensorhub:create_app()" in our case) identifies the callable that HTTP requests are passed to - for Flask it is the Flask application object. The module where the callable is found is given as a Python import path, the same way you would give it when running something with python -m or importing it into another module. In the case of the project we're working on, the Flask application object is created by the create_app function, and therefore we need to define a call to that function in order to obtain it. For single-file applications where you just assign the app to a variable, you would use sensorhub:app instead.
These processes are managed by Gunicorn. The "pre-fork model" part of the description means that the worker processes are spawned in advance at launch, and incoming HTTP requests are handled by these existing workers. This is in opposition to spawning workers or threads for incoming requests during runtime.
If you check Gunicorn's help you can see that it has about 60 other optional arguments, so there's definitely a lot more to using it than what we just did above. The vast majority of these options exist to support different deployment configurations, or to optimize the worker processes further. We don't actually have enough data about the performance of our app to know what should be optimized about its deployment, so we will leave these untouched. We're just going to follow the initial recommendation of using 2 workers per processor core, plus 1.
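If you want to follow that rule of thumb without hard-coding the number, the worker count can be computed from the CPU count. A small sketch, not part of the course project:
import multiprocessing

# 2 workers per core, plus one, as recommended above.
workers = multiprocessing.cpu_count() * 2 + 1
print(f'gunicorn -w {workers} "sensorhub:create_app()"')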

Path 1: Management With Supervisor

Linux is required from here on. Also from this point onward it is recommended that you work on a virtual machine that you can just throw away once you're done. We are about to install things that get automatically started, and cleaning up everything afterward is just a pain in general.
The next piece of the puzzle is Supervisor. As per its documentation, "Supervisor is a client/server system that allows its users to monitor and control a number of processes on UNIX-like operating systems." It is generally used to manage processes that do not come with their own daemonization. Most Python scripts fall under this category. Supervisor allows its managed processes to be automatically started and restarted, and also offers a centralized way for users to manage those processes - including the ability to allow non-admin users to start and restart processes started by root (we'll get back to why this is important later).

Preparing Your App for Daemon Possession

When moving a process to be controlled by Supervisor (or any other management system), there are usually two things that need to be decided:
  1. How configuration parameters are passed to the process
  2. How logs are written
If your process needs to read something from environment variables, these need to be set in the environment it runs in. This also applies to activating the virtual environment. A straightforward way to achieve both of these goals with one solution is to write a small shell script that sets the necessary environment variables and activates the virtual environment. The best exact way to manage configuration depends on what framework is used, and on whether the configuration file is included in the project's git repository or not.
Since Flask has a built-in way to read configuration from a file in its instance folder, a separate configuration file is the recommended way. As it's in the instance folder, it will not go into the project's repository. This means it's suitable for storing secrets too, as long as the file's ownership and permissions are properly set so that it's only readable by the system user that runs your application. The downside is that your project can't ship with a default configuration file, but this is rather easily solved by implementing a terminal command that generates one.
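As a sketch of what that built-in mechanism looks like, here is an application factory that reads configuration from the instance folder. The file name config.py and the factory itself are illustrative, not the course project's actual code:
import os
from flask import Flask

def create_app():
    # instance_relative_config makes from_pyfile look in the instance folder.
    app = Flask(__name__, instance_relative_config=True)
    os.makedirs(app.instance_path, exist_ok=True)
    # Loads e.g. secret keys or database URIs from instance/config.py;
    # silent=True means a missing file is simply ignored.
    app.config.from_pyfile("config.py", silent=True)
    return app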
However, just in case you are working with a rather simple single-file application, we're going to use environment variables in this example to show another way to do it. When doing so, it's important to only read them from a secure file at process launch, and to immediately remove any variables that contain secrets after reading them into program memory. It should be stressed that regardless of how carefully you handle environment variables, properly secured configuration files are still a better approach if you have a way to manage them.
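The "read once, then scrub" idea can be sketched like this (the variable name SENSORHUB_API_KEY is made up for illustration):
import os

# Read the secret into program memory at launch...
api_key = os.environ.pop("SENSORHUB_API_KEY", None)
# ...after pop() it is no longer present in the process environment, so it
# won't leak to child processes or casual environment dumps.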
In this example the chosen approach is to create a second script in the virtual environment's bin folder, called postactivate. This will be invoked exactly like the activate script, so it's convenient to use in both development and deployment. The only difference is that if there are any secrets in this file, its owner should be set to the system user that runs the process, and its permissions to 400 when deploying to production. This user should also be the owner of the local git repository. Since until now you have probably just used your login user for everything, we'll start with a full list of steps. Let's assume our project will be in /opt/sensorhub/sensorhub and its virtual environment in /opt/sensorhub/venv.
  1. Create the system user, e.g. sensorhub
    • sudo useradd --system sensorhub
  2. (Development only) Add your login user to the sensorhub group. You need to do this in order to be able to follow the next instructions (even in production, you might change the groups later).
    • sudo usermod -aG sensorhub $USER
  3. To make the group change take effect, try forcing it in the current session:
    • exec su -p $USER
    • id
    • If your user does not show the sensorhub group, you need to log out and back in to your Linux session.
  4. Create the sensorhub folder and grant ownership to sensorhub user, drop all privileges from other users
    • sudo mkdir /opt/sensorhub
    • sudo chown sensorhub:sensorhub /opt/sensorhub
    • sudo chmod -R o-rwx /opt/sensorhub
  5. Create a virtual environment with the sensorhub user
    • sudo -u sensorhub python3 -m venv /opt/sensorhub/venv
  6. Clone the repository and perform database initialization etc. with the sensorhub user.
    • sudo -u sensorhub git clone https://github.com/UniOulu-Ubicomp-Programming-Courses/pwp-senshorhub-ex-2.git /opt/sensorhub/sensorhub
  7. Create the postactivate file
    • sudo -u sensorhub touch /opt/sensorhub/venv/bin/postactivate
  8. (Production only) change file permissions to owner only
    • sudo chmod 600 /opt/sensorhub/venv/bin/postactivate
  9. Add any required environment variables to the file. For this example, let's set the number of workers by adding this line to it:
    • export GUNICORN_WORKERS=3
  10. Activate the virtual environment for your user and add environment variables
    • source /opt/sensorhub/venv/bin/activate
    • source /opt/sensorhub/venv/bin/postactivate
  11. Install packages with pip as the sensorhub user while passing your environment to the process (otherwise it will try to run with system python instead)
    • cd /opt/sensorhub/sensorhub
    • sudo -u sensorhub -E env PATH=$PATH python -m pip install -r requirements.txt
  12. Set up the database and master key
    • sudo -u sensorhub -E env PATH=$PATH flask --app=sensorhub init-db
    • sudo -u sensorhub -E env PATH=$PATH flask --app=sensorhub testgen
    • sudo -u sensorhub -E env PATH=$PATH flask --app=sensorhub masterkey
  13. Run Gunicorn as sensorhub user
    • cd /opt/sensorhub/sensorhub
    • sudo -u sensorhub -E env PATH=$PATH gunicorn -w $GUNICORN_WORKERS "sensorhub:create_app()"
If this runs successfully without permission errors, your application setup should be ready to be run with Supervisor as well. To check it, you can send an HTTP GET request to the API root:
curl http://127.0.0.1:8000/api/. It should return an answer with the name and version of the API.
While this was a lot of steps, it's a solid crash course in getting Python code onto servers in general. Obviously, if you needed to do this for multiple servers, it would be best to look into either containerization or deployment automation with something like Ansible. If you only need to deal with one server, doing this process once isn't too bad, since updates really only require you to pull your code, install your project with pip, and then restart the processes (with Supervisor).
The last thing we need to do is to write the shell script that allows Supervisor to run this app. A good place to put it is a scripts folder inside your virtual environment; the examples below assume /opt/sensorhub/venv/scripts/start_gunicorn, so create that folder and an executable script file with that name as the sensorhub user.
Here are the contents of the file for now. Copy them in there with your preferred text editor.
#!/bin/sh

cd /opt/sensorhub/sensorhub
. /opt/sensorhub/venv/bin/activate
. /opt/sensorhub/venv/bin/postactivate

exec gunicorn -w $GUNICORN_WORKERS "sensorhub:create_app()"
If you can now run your app with sudo -u sensorhub /opt/sensorhub/venv/scripts/start_gunicorn everything should be ready for the next step.

Commence Supervision

Compared to the previous step, the matter of actually running your process via Supervisor is a lot more straightforward. Supervisor can most probably be installed via your operating system's package manager, i.e. sudo apt install supervisor on systems that use APT. In order for Supervisor to manage your program, you will need to include it in Supervisor's configuration. This is usually best done by placing a .conf file in /etc/supervisor/conf.d/. The exact location and naming convention can differ; for instance on RHEL / CentOS it would be a .ini file inside /etc/supervisord.d.
To explain very briefly in case you've never seen this before: this mechanism of storing custom configuration for programs installed by the package manager as fragments in a .d directory inside the program's configuration folder, instead of editing the main configuration, is intended to make your life easier. The default configuration file is usually written by the package manager, and if there is an update to it, any custom changes in the main configuration file would be in conflict. If all custom configuration is in separate files instead, they will be untouched by the package manager and simply applied after the default configuration has been loaded. As they are loaded after it, they can also include overrides to the default configuration.
The .conf files used by Supervisor follow a relatively common configuration file syntax where sections are marked by square brackets, and each configuration option is written just like a Python variable assignment with =. For Supervisor specifically, configuration sections that specify a program for it to manage must follow the syntax [program:programname]. With that in mind, in order to have Supervisor run Gunicorn with the script we wrote in the previous section, we can create the .conf file with sudo touch /etc/supervisor/conf.d/sensorhub.conf and then drop the following contents into it.
[program:sensorhub]
command = /opt/sensorhub/venv/scripts/start_gunicorn
autostart = true
autorestart = true
user = sensorhub

stdout_logfile = /opt/sensorhub/logs/gunicorn.log
redirect_stderr = true
The last two lines are for writing Gunicorn's (and by proxy our application's) output and error messages into a log file. For this simple example we're using a folder inside sensorhub's own directory. Most logs in Linux would generally go to /var/log, but our sensorhub user doesn't have write access there for now, and this way, when you are done playing with this test deployment, you will have fewer places to clean up. The log directory also needs to be created:
sudo -u sensorhub mkdir /opt/sensorhub/logs
With this we should be ready to reload Supervisor:
sudo systemctl reload supervisor
You can check the status of your process with supervisorctl, and manage it with commands like start, restart, and stop.
$ sudo supervisorctl
sensorhub                        RUNNING   pid 327471, uptime 0:00:24
supervisor> restart sensorhub
sensorhub: stopped
sensorhub: started
supervisor>
Again, an HTTP GET request to /api/ should return the information of our API:
curl http://127.0.0.1:8000/api/

Path 2: Docker Deployment

This section offers an alternative to using Supervisor: running your application in a container with Docker instead. Docker also allows automatic starting of containers, so it can fulfill a similar role. Containers are akin to virtual machines, but they only virtualize the application layer, i.e. they allow each application to have its libraries and binaries independently of the main operating system. In this sense they are also very similar to Python virtual environments, but more isolated. Unlike a full-blown VM, a container typically runs just one application, and if multiple pieces need to collaborate, such as NGINX and Gunicorn in the examples to come, these would be placed in separate containers inside the same pod (see a later section).
Whether you went through the Supervisor tutorial above or not, you should read this section if you're not familiar with Docker because it's relevant for Rahti 2 deployment later.
The first matter at hand is to install Docker, which is simple enough: sudo apt install docker-buildx. This will install the Docker build package as well as the Docker packages. If your distribution does not contain these packages, you can try installing them by following the instructions from the official Docker website:
# Add Docker's official GPG key:
sudo apt-get update
sudo apt-get install ca-certificates curl
sudo install -m 0755 -d /etc/apt/keyrings
sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg -o /etc/apt/keyrings/docker.asc
sudo chmod a+r /etc/apt/keyrings/docker.asc

# Add the repository to Apt sources:
echo \
  "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/ubuntu \
  $(. /etc/os-release && echo "${UBUNTU_CODENAME:-$VERSION_CODENAME}") stable" | \
  sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt-get update
sudo apt-get install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
Running Python applications is also rather simple with Docker, using the existing official Python images from Docker Hub. These images come with everything you need to run a Python application, and all your Dockerfile needs to do is install any required packages and define a command to run the application. Whenever possible, the Alpine image variant should be used. It has the smallest image size, but more limitations on the packages that can be installed. If you can't build your project with it, move up to the slim variant, and finally to the default.

Container Considerations

Because the container is running in its own, well, container, it is unable to communicate directly with the host machine. This means that when running the container, any connections between the host and the container need to be defined. These need to be defined from the host side when launching the container, because otherwise the container itself would not be easily transferable to different hosts. Any web application will at least need a port opened so it can receive HTTP requests. This is done by mapping a port from the host to the container, so that connections to the host port are forwarded to the application.
Another matter when using containers is the use of shared volumes, since again, by default, the container has its own file system and cannot see the host's file system. Everything inside a container is deleted when it is destroyed, and as containers are expected to be expendable, it naturally follows that any data that is supposed to persist must live on the host system, or in an entirely external location such as a database server. Since we have been using SQLite so far, this part of the example shows how to share an instance folder on the host with a container that is running the app. So we will initialize and populate the database on the host machine first, and then run our container so that it has a database ready for use.

Creating Images From Dockerfiles

In order for Docker to be able to run an app, a dockerfile needs to be defined. The sensorhub example comes with the following dockerfile ready to go:
Dockerfile
FROM python:3.13-alpine
WORKDIR /opt/sensorhub
COPY . .
RUN pip install -r requirements.txt 
CMD ["gunicorn", "-w", "3", "-b", "0.0.0.0", "sensorhub:create_app()"]

This file is relatively simple to understand, but let's go through each instruction; they are run in order. FROM picks the base image (here Python 3.13 on Alpine), WORKDIR sets the working directory inside the image, COPY copies the project files from the build context into that directory, RUN installs the required packages, and CMD defines the command that is executed when a container is started from the image.
If you have more specific needs for your image, you can find more information about Dockerfiles in the Dockerfile reference. Looking into the intricacies of how CMD and ENTRYPOINT differ and interact with each other is highly recommended, but not important for going through the examples in this material.
Now that we have some understanding of what this dockerfile does, we can build it.
sudo docker build -t sensorhub .
This command will build the contents of the specified path (.) into an image named sensorhub, which will be stored in the local Docker image storage.
You can always check which Docker images are available on your computer using
sudo docker images
If you want to remove a Docker image, you can do so with sudo docker rmi followed by the image name. But at this stage, let's not remove it.

Running Containers

After building the image you can launch containers from it. The first run example is intended for testing. It keeps the process running inside your terminal so that you can easily see its output and interact with it, e.g. stop it with Ctrl + C (the -it argument), and it will also destroy the container when it exits (the --rm argument). As discussed earlier, the run command here also needs to define the port mapping (-p argument, publish) and a shared volume so that the container can access an existing instance folder (-v argument, volume).
docker run -it --rm -p 8000:8000 -v /your/venv/var/sensorhub-instance:/usr/src/app/instance \
           --name sensorhub-testrun sensorhub
Please note that we are mapping the instance folder on our machine to the instance folder in the container, so we always have access to the database.
When you run the container successfully, it will mostly look the same as running Gunicorn directly.
[2025-02-18 13:09:28 +0000] [1] [INFO] Starting gunicorn 23.0.0
[2025-02-18 13:09:28 +0000] [1] [INFO] Listening at: http://0.0.0.0:8000 (1)
[2025-02-18 13:09:28 +0000] [1] [INFO] Using worker: sync
[2025-02-18 13:09:28 +0000] [7] [INFO] Booting worker with pid: 7
[2025-02-18 13:09:28 +0000] [8] [INFO] Booting worker with pid: 8
[2025-02-18 13:09:29 +0000] [9] [INFO] Booting worker with pid: 9
If you get errors indicating that the address is already in use, you need to check whether Gunicorn or some other service is already using port 8000:
sudo lsof -i :8000
If Gunicorn is running, it might be left over from the previous tasks. You can try to kill the process. If it is running via Supervisor, you need to stop the process in Supervisor:
sudo supervisorctl stop sensorhub
Using 0.0.0.0 as the listen address means that Gunicorn will accept connections that have any IP as the host header. The port mapping -p 8000:8000 means that if you point your browser to localhost:8000, the request will be forwarded to your container, and you should see the response from your app. To check that the shared volume was correctly set up, you can also try to get the /api/sensors/ URI. This should give you a permission error because you didn't send an API key. If it gives you an internal server error instead, the instance folder is not being correctly shared (or you forgot to create and populate a database there).
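If you prefer to script that check instead of using the browser, something along these lines works. It is a sketch that assumes the port mapping above and uses only the standard library:
import urllib.error
import urllib.request

try:
    with urllib.request.urlopen("http://localhost:8000/api/sensors/") as response:
        print(response.status)
except urllib.error.HTTPError as err:
    # urllib raises for 4xx/5xx responses; the status code is what matters:
    # a permission error means the app and its database are reachable,
    # while a 500 points at a missing or unshared instance folder.
    print(err.code)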
If you need to open a shell inside the running container to call certain commands, try sudo docker exec -it sensorhub-testrun sh
As for our last trick, we'll modify the run command so that the container runs in the background instead, and is restarted when the Docker daemon is restarted.
docker run -d -p 8000:8000 --restart unless-stopped \
           -v /your/venv/var/sensorhub-instance:/usr/src/app/instance \
           --name sensorhub-testrun sensorhub
You can stop this container with docker stop sensorhub-testrun. A container stopped this way will not be restarted automatically, but with the "unless-stopped" policy it is still restarted if it exits because of an error. The difference to the "always" policy is that "always" also brings a manually stopped container back up when the Docker daemon restarts. This container is no longer automatically removed, so if you want to remove it, use docker rm sensorhub-testrun.

Start Your Engine X

This step can be done after either of the above two paths with no differences.
In our current state, the application is still serving requests from port 8000 which is the Gunicorn default. Now, of course you could open this port to the world from your server's firewall, but there will probably be firewalls above your firewall that would still block it, because most of the time web servers are only expected to handle connections to the HTTP and HTTPS ports, 80 and 443, respectively. So, what's stopping us from just binding Gunicorn to these ports? Well, UNIX is. Non-root users are not allowed to listen on low-numbered ports. Running your application as root is also a terrible idea, in case you were wondering. Currently if there is a vulnerability in your app, the most damage it can do is to itself because the sensorhub user just doesn't have a whole lot of privileges outside its own directory.
There is another reason as well why you should never serve your app directly with Gunicorn: static files. Not everything that gets accessed from your server is a response generated by application code; sometimes the server also sends files that are simply read from disk and returned to the client as-is. In this case serving these files through static views in your application causes unnecessary overhead. The recommended setup is to have an HTTP web server sit between the wide world and your application. This server's task is to figure out whether a static file is being requested (usually identified by the URL), and if not, forward the request to the Gunicorn workers. One other benefit is that if you need to use HTTPS for encryption, it can be handled in the web server, and neither Gunicorn nor your application needs to bother with it.
The two most commonly used HTTP web servers on Linux are Apache and NGINX (source: Netcraft). Of the two, Apache has been in steady decline, so we have chosen NGINX for this example. It is also somewhat friendlier to work with. The process is more or less similar to what we did with Supervisor: install the server, then create a configuration file for our app. So, install it with your package manager, and then figure out where the configuration should go. On the system where this was written, configuration files go to /etc/nginx/sites-available. NGINX configuration files are made up of directives. A simple directive is written as the directive name followed by its arguments separated by spaces; complex directives that can contain other directives enclose their contents in curly braces.
The configuration we're using is mostly taken from the example in Gunicorn's documentation. The main difference is that the example there is a full configuration file, whereas we only use its server directives together with the default NGINX configuration, putting them into a separate file as described earlier for Supervisor. The configuration file is below, with explanations added as comments in the file itself. There is a lot more to learn about using NGINX efficiently in big deployments, but for our current purposes this simple configuration is good enough. Save the contents of this file to /etc/nginx/sites-available/sensorhub.
sensorhub
server {
    # Make this the default server that is used to process a request
    # where the HOST header does not match the server's server_name
    # directive. Used to prevent host spoofing.
    # Immediately closes the connection.

    listen 80 default_server;
    return 444;
}

upstream app_server {
    # This directive defines how requests are passed to Gunicorn.

    # fail_timeout=0 means we always retry an upstream even if it failed
    # to return a good HTTP response

    # For UNIX domain socket setups, i.e. your app runs on the same
    # machine with NGINX
    # server unix:/tmp/gunicorn.sock fail_timeout=0;

    # For a TCP configuration use IP address instead. Example use
    # would be running NGINX and your app in two different containers
    server 127.0.0.1:8000 fail_timeout=0;
}

server {
    # Use 'listen 80 deferred;' for Linux.
    # This prevents the process from being woken up until there's a
    # packet with real data for it to process.
    listen 80 deferred;
    client_max_body_size 4G;

    # Setting localhost as the server name for now, change this to
    # your hostname accordingly, or IP address if running on a server
    # without a hostname.
    server_name localhost;

    keepalive_timeout 5;

    # The path from which static files are served.
    # We don't have any in the app right now, but it would likely be
    # this for now.
    root /opt/sensorapp/sensorapp/static/;

    location / {
        # Checks for static file. If not found, use the proxy_to_app
        # location to process the request.
        try_files $uri @proxy_to_app;
    }

    location @proxy_to_app {
        # These are headers that allow our app to know more about
        # the original request that was sent.
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_set_header Host $http_host;
        # we don't want nginx trying to do something clever with
        # redirects, we set the Host: header above already.
        proxy_redirect off;

        # This line passes the request on to the address defined
        # by the upstream directive above.
        proxy_pass http://app_server;
    }

    # Serve a custom error 500 page from the app's static folder.
    # Commented out since we don't actually have one.
    # error_page 500 502 503 504 /500.html;
    # location = /500.html {
    #    root /opt/sensorapp/sensorapp/static/;
    # }
}

It's very likely that NGINX was automatically started (and added to autostart) when it was installed. It is currently serving its default server configuration from the /etc/nginx/sites-available/default file. Site configurations are usually managed with symbolic links from /etc/nginx/sites-enabled/ to /etc/nginx/sites-available/. This way multiple configuration files can exist in sites-available, and the symbolic links are used to choose which ones are actually in use. In order to make our new sensorhub configuration the main one, run the first two commands below, and then reload the NGINX configuration with the third one.
sudo ln -s /etc/nginx/sites-available/sensorhub /etc/nginx/sites-enabled/sensorhub
sudo rm /etc/nginx/sites-enabled/default
sudo systemctl reload nginx
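If the reload fails, or NGINX starts returning errors, it is worth validating the configuration syntax first:
sudo nginx -t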
If Supervisor or Docker is still running your app, checking e.g. localhost/api/sensors/ with your browser should now show a response from the app.

Deploying on VM

This task does not ask you to do anything new, it's just here to check that you were able to follow the steps above correctly.
Learning goals: Being able to set up a Flask application behind NGINX on a Linux server.

Task:
In order to complete this task, you need to execute the steps above in a virtual machine in the cPouta cloud. If you haven't done so already, follow this guide to get access to a VM. You can choose how to do it, but at the end your VM must have your server listening on port 80 and be able to produce responses from the sensorhub app, particularly the /api/sensors/ route.
Please populate your database using the test data generation command line utility, and also generate the API master key.
Remember: the server_name directive in your NGINX configuration must use the central server's IP, not your own. The server name is the host name that appears in the URLs requested by the client.

Answer Format:
To answer this task, write a JSON document that has two fields: host and apikey, where host is the local network IP address of your VM, and apikey is the master key you generated for your app.

Deployment Options for This Course

Since you will be asked to make your API available for your project, we need to provide you with options for doing so. The first option we are offering you is to use a virtual machine from the cPouta cloud, and do the deployment as described above. The second option is to use the Rahti 2 service where you can run containers.

Deploying in Rahti 2

The instructions above allow you to manually install your app and NGINX on a Linux server. This is useful, fundamental information, and it also allows you to understand a little bit more about what might be taking place underneath the surface when using a cloud application platform. However as you probably know already, using cloud platforms is how things are usually done when scaling and ease of deployment are required.
In order to run the NGINX -> Gunicorn -> Flask app chain in Rahti, we need to put all components in the same pod. When containers run inside the same pod, they share an internal network that allows them to communicate with each other. This of course means that we now need to stuff NGINX into a container as well. More specifically, we need to stuff it into a container in a way that works with OpenShift. Normally NGINX starts as root and then drops privileges to another user, allowing it to listen on a privileged port without actually giving root privileges to anything that runs on it. OpenShift does not allow running anything as root. This means that the user NGINX runs as must be the same the whole time, and that it cannot listen on privileged ports.

NGINX in a Box

We've provided the necessary files in a second GitHub repository. They've been forked from CSC's tutorial project and adapted to our example. We'll only cover them briefly in this section. For now there are two files to care about: the dockerfile and the configuration file. The latter is almost the same as the file we showed you earlier. The dockerfile is shown below.
Dockerfile
FROM nginx:alpine

# support running as an arbitrary user which belongs to the root group
RUN chmod g+rwx /var/cache/nginx /var/run /var/log/nginx && \
    chown nginx:root /var/cache/nginx /var/run /var/log/nginx && \
    # make it possible for nginx user to replace default.conf when launched
    chmod -R g+rwX /etc/nginx/conf.d && \
    chown -R nginx:root /etc/nginx/conf.d && \
    # comment user directive as master process is run as user in OpenShift
    # comment server_names_hash_bucket_size if set (should not be in current image)
    # then add it and set to 128 to support the long hostnames given by Rahti
    sed -i.bak -e 's/^user/#user/' \
               -e 's/^(\s*)server_names_hash_bucket_size/\1# server_names_hash_bucket_size/' \
               -e 's/http {/http {\n    server_names_hash_bucket_size 128;/' \
               /etc/nginx/nginx.conf

COPY default.conf.template /etc/nginx/templates/

WORKDIR /usr/share/nginx/html/
EXPOSE 8080

USER nginx:root

The most notable thing about this file is the rather long RUN instruction. Chaining commands together into one RUN instruction with the AND operator (&&) is a common practice to reduce image size, because every instruction creates a new layer in the image. Layers are related to build efficiency and you can read more about them in Docker's documentation. Essentially nothing in our RUN instruction is worth saving as a layer, so it's better to do everything at once. The instruction itself makes certain directories in the file system accessible to users in the root group, and finally does some witchcraft with sed on the nginx.conf file:
  1. Comment out the user directive - processes started by non-root users cannot change uid.
  2. Comment out the server_names_hash_bucket_size directive just in case - it's not currently defined in the image's default configuration, but you never know if it gets added back in the future.
  3. Add the same directive and set it to a higher value than the default. Rahti's hostnames are too long for the default size.
The other part is the configuration, which is now a template. NGINX does not normally read environment variables, but if you write templates and put them in /etc/nginx/templates, configuration files will be generated from those templates, with environment variables substituted by their values in the process. This is necessary because we need to be able to configure the server's server_name without modifying the image. We also changed the listen port to 8080 because we cannot listen on port 80 without root privileges.
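As an illustration of what the template mechanism does (the template line below is hypothetical, but the substitution is essentially what the official nginx image performs at startup with envsubst): if default.conf.template contains a line like server_name ${HOSTNAME};, the result can be reproduced by hand:
HOSTNAME=localhost envsubst '$HOSTNAME' < default.conf.template > default.conf
grep server_name default.conf    # should now show the substituted value, e.g. server_name localhost;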
The root path was changed but it doesn't really matter right now because we're not serving any static files. Just take note of this because earlier NGINX was running with access to the file system where the Sensorhub's static files were. Now they will be in another container, which means they are no longer accessible directly, and some amount of sorcery is required to make the static files accessible again. We are not covering this sorcery here.
Note also that instead of doing things with sites-available and sites-enabled, we're just copying the configuration over the default.conf file. This NGINX container only serves a single application, so bothering with elaborate configuration management would be overkill and would just add unnecessary instructions to the dockerfile. You can build the image and do a test run now. Note that the HOSTNAME environment variable needs to be set when running the image.
sudo docker build -t sensorhub-nginx .
sudo docker run --rm -p 8080:8080 -e HOSTNAME='localhost' sensorhub-nginx
If you try to visit localhost:8080 in your browser, you should be greeted by 502 Bad Gateway because Gunicorn is in fact not running.

Flask in the Same Box

To avoid creating duplicate images to public repositories, the examples below use pre-built images of Sensorhub and its NGINX companion that we have uploaded to Docker Hub. We could technically also instruct you how to manage a local image repository instead but this tutorial already has quite a lot of stuff in it.
Since we are deploying to a cluster, and will eventually upload the image to Rahti 2, the user running Gunicorn cannot be root. We have therefore modified the dockerfile so that everything goes into the /opt/sensorhub folder and is run by the sensorhub user. You do not need this file to follow this explanation because, remember, we provide the pre-built images; but it is what you would need if you wanted to build your own image. One last thing: make sure that your repository does not contain the instance folder, otherwise the image build will fail.
openshift
FROM python:3.13-alpine
WORKDIR /opt/sensorhub
COPY . .
RUN pip install -r requirements.txt && \
    mkdir /opt/sensorhub/instance && \
    chgrp -R root /opt/sensorhub && \
    chmod -R g=u /opt/sensorhub
CMD ["gunicorn", "-w", "3", "-b", "0.0.0.0", "sensorhub:create_app()"]

In order to get more than one container to run in the same pod, we need to learn some very basics of container orchestration. The Deployment.yaml file in the repository (check the templates folder) is the final product that allows the pod to be run in Rahti 2. In order to better understand the process, we're going to start from something slightly simpler that you can run on your own machine. This example uses Kubernetes for orchestration, as it is the base of OpenShift and most things will remain the same when moving to Rahti. You can also check Docker's brief orchestration guide for a quick start.
If you feel like you already know this stuff, you can skip ahead to where we start deploying things in Rahti.
Before we can run anything, we need access to a Kubernetes cluster. For now, we're going to use Kind to run a local development Kubernetes cluster. See Kind's quick start guide for installation instructions, or use the following spells to grab the binary.
curl -Lo ./kind https://kind.sigs.k8s.io/dl/v0.27.0/kind-linux-amd64
chmod +x ./kind
sudo mv ./kind /usr/local/bin/kind
You also need to install kubectl. If you are using Ubuntu:
snap install kubectl --classic
You will also need the following small configuration file for starting your cluster. Otherwise you cannot access things running in the cluster without explicit port forwarding. The port defined here (30001) must match the service's nodePort in the service configuration later.
test-cluster.yml
apiVersion: kind.x-k8s.io/v1alpha4
kind: Cluster
nodes:
- role: control-plane
  extraPortMappings:
  - containerPort: 30001
    hostPort: 30001
- role: worker

sudo kind create cluster --config test-cluster.yml
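Once the command finishes, you can check that the cluster is up and that both nodes registered:
sudo kubectl cluster-info
sudo kubectl get nodes    # should list the control-plane and worker nodes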
Next we need to define a deployment and a service. These can be included in the same file by defining two documents inside it, using --- as a separator (it indicates "start of document" in YAML). The file below has been adapted from Docker's Kubernetes deployment tutorial. The biggest modification is running two containers in the pod instead of just one. As stated earlier, the images are pulled from Docker Hub.
sensorhub-deployment.yml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: sensorhub-test-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      app: sensorhub
  template:
    metadata:
      labels:
        app: sensorhub
    spec:
      containers:
        - name: sensorhub-testrun
          image: docker.io/mioja/sensorhub:latest
          imagePullPolicy: Always
          ports:
          - containerPort: 8000
            protocol: TCP
        - name: nginx-testrun
          image: docker.io/mioja/sensorhub-nginx:latest
          imagePullPolicy: Always
          ports:
          - containerPort: 8080
            protocol: TCP
          env:
          - name: HOSTNAME
            value: dis.server.has.a.really.long.name.just.so.you.know.fi
---
apiVersion: v1
kind: Service
metadata:
  name: sensorhub-entrypoint
  namespace: default
spec:
  type: NodePort
  selector:
    app: sensorhub
  ports:
    - port: 8080
      targetPort: 8080
      nodePort: 30001

Most of the file is just the minimum definition to make the deployment run and accept connections. For instance, labels must be defined, and the service part must have a selector matching those labels. In a real large-scale deployment these would carry more meaning than just "having to exist", but that is a topic for another course. The template part is where the actual containers are defined; these too are kept very simple in our example.
The service part of this file is what routes traffic to the pod(s). In this example we're using the NodePort type, as that allows connecting to the pod from localhost for testing. The ports listed here are: port, the port the service itself exposes inside the cluster; targetPort, the container port that traffic is forwarded to (NGINX's 8080 in our case); and nodePort, the port published on the cluster node, which is what makes localhost:30001 work together with the Kind port mapping defined earlier.
With this file you can run the whole thing in your local cluster. Since we are working on a local machine, you should first change the HOSTNAME environment variable's value to localhost. After this change we can apply the file:
sudo kubectl apply -f sensorhub-deployment.yml
After a short while your pod should be up. If you hit localhost:30001 in your browser you should get a response from the Flask application. Unfortunately it doesn't have a database set up, so all it can give you is internal server errors or not founds, depending on which URI you hit. This still means the setup is working correctly as a whole, since traffic is being routed all the way to the app. You can also check your deployments with:
sudo kubectl get deployment
Replace deployment with pod to see the pod that is managed by the deployment. You can also try to play whack-a-mole with your pods to see how they get restarted automatically:
sudo kubectl delete pod -l "app=sensorhub"
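If you want to see the replacement pod come up in real time, the watch flag of kubectl get is handy:
sudo kubectl get pod -l "app=sensorhub" -w    # Ctrl + C to stop watching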

Adding Persistent Volume

Pods can mount data that is maintained in the cluster by using persistent volume claims. This is a reasonable way to get the Flask instance folder stored between pod restarts, and is essentially the Kubernetes way for doing the volume mount we previously did from the command line with Docker directly. This process starts by defining a persistent volume claim, which you can do by adding the following snippet into the existing sensorhub-deployment.yml file.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: sensorhub-pvc
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 32Mi
This volume claim has a name that can be referred to in container configurations. The access mode used here is the same one that is available in Rahti: ReadWriteOnce. It is rather limiting, allowing the volume to be mounted into only one pod at a time, but it is sufficient for our testing purposes. Finally we assign this volume a small amount of disk space, as we are not really going to do anything with it.
In order to make use of this volume, a volume needs to be defined in the spec part of the file, and then it needs to be mounted in the Sensorhub container's configuration. First add the volumes object before the containers key in your file:
spec:
  ...
  template:
    ...
    spec:
      volumes:
      - name: instance-vol
        persistentVolumeClaim:
          claimName: sensorhub-pvc
The name here can be referred to from a container configuration's volumeMounts. Add the following to the Sensorhub container's section:
spec:
  ...
  template:
    ...
    spec:
      containers:
      - name: sensorhub-testrun
        volumeMounts:
        - mountPath: /opt/sensorhub/instance
          name: instance-vol
Note that the path is different from when we were running Sensorhub in Docker earlier, since we are not running as root. After these additions your file should look like this:
sensorhub-volume.yml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: sensorhub-pvc
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 32Mi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: sensorhub-test-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      app: sensorhub
  template:
    metadata:
      labels:
        app: sensorhub
    spec:
      containers:
        - name: sensorhub-testrun
          image: docker.io/mioja/sensorhub:latest
          imagePullPolicy: Always
          ports:
          - containerPort: 8000
            protocol: TCP
          volumeMounts:
          - mountPath: /opt/sensorhub/instance
            name: instance-vol
        - name: nginx-testrun
          image: docker.io/mioja/sensorhub-nginx:latest
          imagePullPolicy: Always
          ports:
          - containerPort: 8080
            protocol: TCP
          env:
          - name: HOSTNAME
            value: localhost
      volumes:
      - name: instance-vol
        persistentVolumeClaim:
          claimName: sensorhub-pvc
---
apiVersion: v1
kind: Service
metadata:
  name: sensorhub-entrypoint
  namespace: default
spec:
  type: NodePort
  selector:
    app: sensorhub
  ports:
    - port: 8080
      targetPort: 8080
      nodePort: 30001

Now you can re-apply the entire file, and your pod should have a nice persistent volume attached to it. Let's finally go inside the pod and run the familiar set of initialization commands to make the API actually do something. In order to do so, we will execute a new shell inside the container. This is in general very useful if you need to debug what's going on inside your containers.
sudo kubectl exec -it deploy/sensorhub-test-deployment -c sensorhub-testrun -- sh
This gives us a shell session that conveniently has /opt/sensorhub as its working directory and we can run the three setup commands as always:
flask --app=sensorhub init-db
flask --app=sensorhub testgen
flask --app=sensorhub masterkey
Grab the master key to clipboard before exiting the shell session with Ctrl + D. Finally, let's access the entire thing with curl:
curl -H "Sensorhub-Api-Key: <copied api key>" localhost:30001/api/sensors/
Finally the fun part where we whack the pod down, wait for it to restart, and witness how the volume does indeed persist.
sudo kubectl delete pod -l "app=sensorhub"
curl -H "Sensorhub-Api-Key: " localhost:30001/api/sensors/
If everything went well, you should be getting the same result because the database contents are the same. Now we are essentially done with this local deployment, so if you don't want it running and restarting itself in the background anymore, you need to whack down the whole deployment. If you pass the file to delete, everything defined in it will be deleted. This includes your persistent volume contents, so the database is then gone for good.
sudo kubectl delete -f sensorhub-deployment.yml
Are there better ways to initialize and populate your database than using a shell session in the container? Absolutely, but these are best figured out at another time. Like when you are using a real database instead of SQLite for instance. There is also a better way to set the API master key.

Moving to Rahti

In order to make this tutorial easier to follow, we are going to use the OpenShift command line interface instead of trying to navigate the web UI. The first thing to do is to download the binary and move it so that it's in your path.
wget https://downloads-openshift-console.apps.2.rahti.csc.fi/amd64/linux/oc.tar
tar -xf oc.tar
sudo mv oc /usr/local/bin/oc
rm oc.tar
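A quick way to check that the binary is found in your path is to ask for its version (at this point it will only know the client version, since you haven't logged in yet):
oc version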
After this you need to log in from the command line. In order to do this, go to the Rahti 2 web console and obtain a login token by copying the entire login command from the user menu.
The login command should look like this:
oc login https://api.2.rahti.csc.fi:6443 --token=<secret access token>
Note that this token expires, so if you take more than a day to complete all of this, you need to obtain a new API token. Just repeat the above steps if oc asks you to log in again. If it says you have multiple projects, please pick the correct one before doing anything else.
oc project pwp-deploy-tests
We have premade the necessary files for the deployments. They are in the templates folder of the NGINX repository that was used in the previous step. All you need to do is change some values.
First you need to change the names and labels in the Deployment and Service files in order to avoid conflicts with other groups. The fastest way to do this is to use sed; run the commands in the root of the Sensorhub NGINX repository. Similarly, you need to change the HOSTNAME environment variable to the actual hostname Rahti assigns to your app. In order to make the rest of the commands easy to copy-paste, we'll start by exporting your group's name in lowercase, hyphenated form (e.g. "My Amazing Group" would become "my-amazing-group"). If your group name is long, shorten it to a reasonable size, as the Rahti hostnames are looooong already.
export PWPGROUP=<groupname>
sed -i "s/sensorhub-nginx-deployment/$PWPGROUP-sensorhub/" templates/*.yaml
sed -i "s/value: localhost/value: $PWPGROUP-pwp-deploy-tests.2.rahtiapp.fi/" templates/Deployment.yaml
The next four commands should fire up your deployment and service, and also create a route.
oc create -f templates/Deployment.yaml
oc create -f templates/Service.yaml
oc create route edge $PWPGROUP --service=$PWPGROUP-sensorhub --insecure-policy="Redirect"
oc get route $PWPGROUP
The route's hostname should be the same one that we asked you to replace localhost with. If it's not, change the value in the deployment file and then update your deployment with oc apply -f templates/Deployment.yaml.
It takes a few seconds for Rahti to fire up your pod, but after that you should be able to get the index page with your browser or curl. If you get something else like 502 Bad Gateway, something is wrong with your configuration. First, check that both containers are running correctly with
oc get pod -l app=$PWPGROUP-sensorhub
Next, check the NGINX logs (increase the --tail value if you can't see anything useful):
oc logs -l app=$PWPGROUP-sensorhub -c nginx --tail=20
For further debugging, get a shell into each of the containers with these commands:
oc exec deploy/$PWPGROUP-sensorhub -it -c nginx -- sh
oc exec deploy/$PWPGROUP-sensorhub -it -c sensorhub -- sh
The most likely cause is in the NGINX configuration files. Check them with cat to see if there's anything off. You can also edit them with vim (short guide: press i to go into insert mode, do what you need, press Esc and type :wq to save and exit) and then reload the configuration with nginx -s reload. If it starts working, great; make the same changes in your deployment file and re-apply it to persist them.
Remember that you still need to initialize the Sensorhub database and generate the API key. To do that, open a shell in the container where Gunicorn is running (the second oc exec command above) and execute the commands that you already know by heart:
flask --app=sensorhub init-db
flask --app=sensorhub testgen
flask --app=sensorhub masterkey
When you are done, remember to take the pods down so you don't consume resources unnecessarily:
oc delete deployment $PWPGROUP-sensorhub
oc delete service $PWPGROUP-sensorhub
oc delete route $PWPGROUP

Extra: Using a Real Database

This section is considered extra information, but it is very useful. In it you'll learn how to get a PostgreSQL database from CSC's Pukki service, and how to configure your Sensorhub container in Rahti to use it instead of SQLite. Once again the tutorial uses command line tools to communicate with Pukki, so the first thing is to follow the instructions on how to set them up. The instructions are adapted from CSC's documentation.

Getting a Database from Pukki

First, you'll want to go back to using the virtual environment that was used when we created a VM with OpenStack, because it already has the necessary tools installed. Once you are in the correct virtual environment, log in to Pukki and obtain the OpenStack RC file from the user menu on the right.
Save the file somewhere inside your virtual environment with a nice name like pukki.sh, and source it with
source pukki.sh
You will be prompted for your CSC password. Check that it's properly connected e.g. with
openstack datastore list
At this point you are ready to create your database. We have already created the database instance (i.e. the database server) that you should use; it is called exercise_3_instance. The only thing for you to do is to create a database and a user that has access to it. We are going to assume that you are still in the same shell session where the $PWPGROUP environment variable is defined. You should also generate a password for your database and, for following these instructions, put it in an environment variable. We'll do it with Python's secrets module since we are already using Python, but there are other ways (IMPORTANT: do not keep passwords in environment variables in real production environments; we only do it here for convenience).
export DBPASS=$(python -c "import secrets; print(secrets.token_urlsafe(32))")
Now create a database and a user for it:
openstack database db create exercise_3_instance $PWPGROUP-sensorhub
openstack database user create exercise_3_instance $PWPGROUP $DBPASS --databases $PWPGROUP-sensorhub
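If you want to confirm that both the database and the user were created, the corresponding list commands should show them (assuming the same Trove command line plugin that provides the create commands above):
openstack database db list exercise_3_instance
openstack database user list exercise_3_instance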
If you want to test connecting to this database, you need to do it from a virtual machine in the course's cPouta cloud or from one of your Rahti containers, because the database instance only allows connections from those sources. In both cases you need to install the PostgreSQL client utility. Regardless of whether you test the connection manually or not, you are going to need the database instance's IP address. You can find it by looking at the instance's details:
openstack database instance show exercise_3_instance
The address field has two entries; use the IP from the public one. To connect manually:
psql --host xxx.xxx.xxx.xxx --user $PWPGROUP $PWPGROUP-sensorhub
If you are prompted for your password and get a psql prompt afterwards, everything is set up correctly.

Configuring Flask Containers

In order to make the Sensorhub app use this database instead of its SQLite default, there needs to be a file called config.py in its instance folder. This file can override any of the default values that were used for development. We should override two: the secret key, and the database URI. However, before we move on to creating the configuration, let's talk briefly about how configuration files and secrets should be managed in Rahti.
Rahti uses two concepts for configuring containers: ConfigMaps and Secrets. ConfigMaps are reusable objects that can be either written into the container's environment as variables, or mounted as a volume with each key as a file. In our case we would like to mount a ConfigMap as the Flask app's instance folder so that it creates the config.py file there. Note that this approach is NOT compatible with mounting the instance folder from a persistent volume so if you were doing that, you need to remove the mount from your deployment.
Secrets are similar to ConfigMaps, but they have some additional protection to prevent you from accidentally showing them to someone who is not supposed to see them. Anyone with access to the project's Rahti environment can still read a secret, as it is only base64 encoded and can be displayed when specifically requested. You could put the whole configuration in a Secret, really, but just to show you both, what we are going to do is use a ConfigMap to drop a configuration file into the instance folder that reads all of the actual secrets from environment variables, and then get the secrets into those environment variables.
The configuration file contents
import os
SECRET_KEY = os.environ.pop("FLASK_SECRET_KEY")
SQLALCHEMY_DATABASE_URI = (
    f"postgresql://{os.environ.pop("DB_USER")}:{os.environ.pop("DB_PASS")}@"
    f"{os.environ.pop("DB_HOST")}/{os.environ.pop("DB_NAME")}"
)
Since this ConfigMap doesn't have any group-specific information, you can just use the one that already exists in the course's Rahti; it will be included in your Deployment file in a bit. The secrets, on the other hand, are group-specific, so you will have to create your own. Creating a Secret YAML file manually is a pain because you need to base64 encode all of the values, so a swifter alternative is to create the secret with the --from-env-file argument. Prepare an env file like the one below, naming it secret and replacing the placeholders with your group's values.
SECRET_KEY=<some random string>
DB_USER=<username>
DB_PASS=<password>
DB_HOST=<ip>
DB_NAME=<database>
Then create a secret from this file using
oc create secret generic $PWPGROUP-secret --from-env-file secret
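You can verify that the secret exists, and see for yourself that the values are merely base64 encoded, with:
oc get secret $PWPGROUP-secret -o yaml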
These are taken into use in your Deployment file by adding a suitable volume that mounts the ConfigMap, very similarly to how the shared volume was mounted earlier, and by defining environment variables whose values are taken from Secrets. Mounting the ConfigMap we already have in Rahti is done with this under spec:
spec:
  ...
  template:
    metadata: ...
    spec:
      volumes:
      - name: sensorhub-config
        configMap:
          name: sensorhub-configmap
          defaultMode: 0400
It is then mounted just like a shared volume. The snippet below also shows how to get one of the secrets defined above into an environment variable inside the container.
spec:
  ...
  template:
    metadata: ...
    spec:
      ...
      containers:
      - image: docker.io/mioja/sensorhub-nginx:latest
        ...
      - image: docker.io/mioja/sensorhub:latest
        ...
        volumeMounts:
        - name: sensorhub-config
          readOnly: true
          mountPath: /opt/sensorhub/instance/config/
        env:
        - name: FLASK_SECRET_KEY
          valueFrom:
            secretKeyRef:
              name: test-secret
              key: SECRET_KEY
Note that the mount path goes to a subdirectory config inside the instance folder. This is because mounting to the instance folder itself would cause unnecessary dancing around with permissions when Flask wants to write into the folder for any reason (like when creating a file system cache). The Sensorhub app in our Git repository has been modified to read configuration from this subdirectory instead of the instance root. The full Deployment file can be downloaded below:
SensorhubDeployment.yml
Remember to change the app labels, secret names etc. to match your group once again before applying this file, or take its changes and put them into your own configuration file.
We showed the above way first because it demonstrates multiple ways of doing things, not necessarily because it is the best way. Alternatively, you can simply create a configuration file that has all the information in it, and mount that file as a Secret. This just means changing the volume definition to reference a Secret instead of a ConfigMap:
spec:
  ...
  template:
    metadata: ...
    spec:
      volumes:
      - name: sensorhub-config
        secret:
          secretName: my-secret-config
          defaultMode: 0400
This approach has a few advantages, starting from being more concise: there is no longer a need to define multiple environment variables in the Deployment. It also doesn't put secrets into environment variables, which is generally preferred. Of course the file itself can be read by the user Gunicorn is running as, so if an attacker gets access to your file system, it doesn't take them much effort to reveal your secrets either. Either way, this is not a cybersecurity course, so please refer to other sources for further security considerations. The advantage of the previous approach was mostly that we can define one configuration file that can be shipped with the app itself and will fill in the blanks from environment variables. This would be more relevant with Django, where configuration files are part of the application itself and are pushed into the repository.
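As a sketch of this alternative (assuming your full configuration is in a local file called config.py and the secret name matches the snippet above), the secret would be created directly from the file:
oc create secret generic my-secret-config --from-file=config.py
# when the secret is mounted, the key "config.py" shows up as a file with that name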

Deploying in Containers

Similarly to the previous task, this one only asks you to give access to the pod you set up, so that we can check it was done correctly.
Learning goals: Deploying a Flask app and NGINX in a single pod.

Task:
This time you need to set up the same API on Rahti's container platform, following the instructions above. Once again you should populate your database with test data and generate a master key. For the database you can use either SQLite, as in the main tutorial, or PostgreSQL, as in the extra part.

Answer Format:
The answer format is the same as previously: write a JSON document that has two fields, host and apikey. This time host is the domain name Rahti has given to your deployment; apikey is still the master key you generated for your app.

After You Are Done:
Remember to delete your Pod and its associated resources with:
oc delete deployment $PWPGROUP-sensorhub
oc delete service $PWPGROUP-sensorhub
oc delete route $PWPGROUP
API Blueprint is a description language for REST APIs. Its primary categories are resources and their related actions (i.e. HTTP methods). It uses a relatively simple syntax. The advantage of using API Blueprint is the wide array of tools available. For example Apiary has a lot of features (interactive documentation, mockup server, test generation etc.) that can be utilized if the API is described in API Blueprint.
Another widely used alternative to API Blueprint is OpenAPI.
Addressability is one of the key REST principles. It means that in an API everything should be presented as resources with URIs so that every possible action can be given an address. On the flipside this also means that every single address should always result in the same resource being accessed, with the same parameters. From the perspective of addressability, query parameters are part of the address.
Ajax is a common web technique. It used to be known as AJAX, an acronym for Asynchronous JavaScript And XML, but with JSON largely replacing XML, it became just Ajax. Ajax is used in web pages to make requests to the server without triggering a page reload. These requests are asynchronous - the page script doesn't stop to wait for the response; instead a callback is set to handle the response when it is received. Ajax can be used to make a request with any HTTP method.
Anonymous functions are usually used as in-place functions to define a callback. They are named such because they are defined just like functions, but don't have a name. In JavaScript a function definition returns the function as an object so that it can e.g. be passed as an argument to another function. Generally they are used as one-off callbacks when it makes the code more readable to have the function defined where the callback is needed rather than somewhere else. A typical example is the forEach method of arrays: it takes a callback as its argument and calls that function for each of the array's members. One downside of anonymous functions is that the function is defined anew every time, which can cause significant overhead if done constantly.
In Flask, the application context (app context for short) is an object that keeps track of application-level data, e.g. configuration. You always need to have it when trying to manipulate the database etc. View functions automatically have the app context, but if you want to manipulate the database or test functions from the interactive Python console, you need to obtain the app context using a with statement.
Blueprint is a Flask feature, a way of grouping different parts of the web application so that each part is registered as a blueprint with its own root URI. A typical example could be an admin blueprint for admin-related features, using the root URI /admin/. Inside a blueprint, routes are defined relative to this root, i.e. the route /users/ inside the admin blueprint would have the full route of /admin/users/.
Defines how data is processed in the application
Cross Origin Resource Sharing (CORS) is a relaxation mechanism for the Same Origin Policy (SOP). Through CORS headers, servers can state which external origins are allowed, what can be requested, and what headers can be included in those requests. If a server doesn't provide CORS headers, browsers will apply the SOP and refuse to make requests unless the origin is the same. Note that the primary purpose of CORS is to allow only certain trusted origins. Example scenario: a site with a dubious script cannot just steal a user's API credentials from another site's cookies and make requests using them, because the API's CORS configuration doesn't allow requests from that site's origin. NOTE: this is not a mechanism to protect your API, it's to protect browser users from accessing your API unintentionally.
Callback is a function that is passed to another part of the program, usually as an argument, to be called when certain conditions are met. For instance in making Ajax requests, it's typical to register a callback for at least success and error situations. A typical feature of callbacks is that the function cannot decide its own parameters, and must instead make do with the arguments given by the part of the program that calls it. Callbacks are also called handlers. One-off callbacks are often defined as anonymous functions.
Piece of software that consumes or utilizes the functionality of a Web API. Some clients are controlled by humans, while others (e.g. crawlers, monitors, scripts, agents) have varying degrees of autonomy.
In databases, columns define the attributes of objects stored in a table. A column has a type, and can have additional properties such as being unique. If a row doesn't conform with the column types and other restrictions, it cannot be inserted into the table.
In object relational mapping, column attributes are attributes in model classes that have been initialized as columns (e.g. in SQLAlchemy their initial value is obtained by initializing a Column). Each of these attributes corresponds to a column in the database table (that corresponds with the model class). A column attribute defines the column's type as well as additional properties (e.g. primary key).
In OpenAPI the components object is a storage for reusable components. Components inside this object can be referenced from other parts of the documentation. This makes it a good storage for any descriptions that pop up frequently, including path parameters, various schemas, and request body objects. This also includes security schemes if your API uses authentication.
Connectedness is a REST principle particularly related to hypermedia APIs. It states that for each resource in the API, there must exist a path from every other resource to it by following hypermedia links. Connectedness is easiest to analyze by creating an API state diagram.
Container is a virtualization concept where the virtualization is limited to the software layer. Unlike traditional virtual machines (VMs) that virtualize a full computer from the hardware up, containers share operating system resources with each other and only provide an isolated run environment. Every container can define what is installed in its running environment, but they have less overhead than VMs. They are faster to work with and easy to replicate as well. Due to the shared hardware layer, it is possible for malicious containers to break out of their isolation and affect the hardware shared by all containers running on the same system.
A hypermedia control is an attribute in a resource representation that describes a possible action to the client. It can be a link to follow, or an action that manipulates the resource in some way. Regardless of the used hypermedia format, controls include at least the URI to use when performing the action. In Mason controls also include the HTTP method to use (if it's not GET), and can also include a schema that describes what's considered valid for the request body.
A (URL) converter is a piece of code used in web framework routing to convert a part of the URL into an argument that will be used in the view function. Simple converters are usually included in frameworks by default. Simple converters include things like turning number strings into integers etc. Typically custom converters are also supported. A common example would be turning a model instance's identifier in the URL to the identified model instance. This removes the boilerplate of fetching model instances from view functions, and also moves the handling of Not Found errors into the converter.
The term credentials is used in authentication to indicate the information that identifies you as a specific user from the system's point of view. By far the most common credentials is the combination of username and password. One primary goal of system security is the protection of credentials.
Document Object Model (DOM) is an interface through which Javascript code can interact with the HTML document. It's a tree structure that follows the HTML's hierarchy, and each HTML tag has its own node. Through DOM manipulation, Javascript code can insert new HTML into anywhere, modify its contents or remove it. Any modifications to the DOM are updated into the web page in real time. Do note that since this is a rendering operation, it's very likely one of the most costly operations your code can do. Therefore changing the entire contents of an element at once is better than changing it e.g. one line at a time.
Daemons are processes that run independently on the background and are typically started by the system without any user interaction, or if started manually, they will keep running even if the user logs out as long as the system itself is running. The operating system and other programs can communicate with daemons through sockets, and their output is usually found in log files. Most daemons that are installed from package managers are controlled with systemctl. It's also useful to know that Supervisor allows running non-daemon processes as daemons.
Database schema is the "blueprint" of the database. It defines what tables are contained in the database, and what columns are in each table, and what additional attributes they have. A database's schema can be dumped into an SQL file, and a database can also be created from a schema file. When using object relational mapping (ORM), the schema is constructed from model classes.
Decorator is a function wrapper. Whenever the decorated function is called, its decorator(s) will be called first. Likewise, when the decorated function returns values, they will be first returned to the decorator(s). In essence, the decorator is wrapped around the decorated function. Decorators are particularly useful in web development frameworks because they can be inserted between the framework's routing machinery, and the business logic implemented in a view function. Decorators can do filtering and conversion for arguments and/or return values. They can also add conditions to calling the view function, like authentication where the decorator raises an error instead of calling the view function if valid credentials are not presented.
In HTML element refers to a single tag - most of the time including a closing tag and everything in between. The element's properties are defined by the tag, and any of the properties can be used to select that element from the document object model (DOM). Elements can contain other elements, which forms the HTML document's hierarchy.
For APIs entry point is the "landing page" of the API. It's typically in the API root of the URL hierarchy and contains logical first steps for a client to take when interacting with the API. This means it typically has one or more hypermedia controls which usually point to relevant collections in the API or search functions.
Environment variables are values that are available in the running environment of a process, and are typically used to give processes information about the specific environment they are running in. This can include both configuration parameters, and information that is gathered from the operating system. They are most visible in shell sessions where all processes that are started from the shell inherit its environment variables but processes running without shell also have them, usually set when the process is started. Although sometimes used to replace configuration, their main purpose is to customize how a process is run temporarily. Using environment variables to store secrets is generally not advised, but sometimes done when other options are not available.
Their names are typically written in uppercase. One of the more notable variables is PATH, which indicates what directories should be searched for when the process tries to invoke an executable.
In software testing, a fixture is a component that satisfies the preconditions required by tests. In web application testing the most common role for fixtures is to initialize the database into a state that makes testing possible. This generally involves creating a fresh database, and possibly populating it with some data. In this course fixtures are implemented using pytest's fixture architecture.
This term contains basic instructions about setting up and running Flask applications. See the term tabs "Creating DB" and "Starting the App". For all instructions to work you need to be in the folder that contains your app.
In database terminology, foreign key means a column that has its value range determined by the values of a column in another table. They are used to create relationships between tables. The column referenced by the foreign key in the target table must be unique.
For most hypermedia types, there exists a generic client. This is a client program that constructs a navigable user interface based on the hypermedia controls in the API, and can usually also generate data input forms. The ability to use such clients for testing and prototyping is one of the big advantages of hypermedia.
HTTP method is the "type" of an HTTP request, indicating what kind of an action the sender is intending to do. In web applications by far the most common method is GET which is used for retrieving data (i.e. HTML pages) from the server. The other method used in web applications is POST, used in submitting forms. However, in REST API use cases, PUT and DELETE methods are also commonly used to modify and delete data.
HTTP request is the entirety of the request made by a client to a server using the HTTP protocol. It includes the request URL, request method (GET, POST etc.), headers and request body. In Python web frameworks the HTTP request is typically turned into a request object.
In computing a hash is a string that is calculated from another string or other data by an algorithm. Hashes have multiple uses ranging from encryption to encoding independent transmission. Hash algorithms can roughly be divided into one- and two-directional. One-directional hashing algorithms are not reversible - the original data cannot be calculated from the hash. They are commonly used to store passwords so that plain text passwords cannot be retrieved even if the database is compromised. Two-directional hashes can be reversed. A common example is the use of base64 to encode strings to use a limited set of characters from the ASCII range to ensure that different character encodings at various transmission nodes do not mess up the original data.
Headers are additional information fields included in HTTP requests and responses. Typical examples of headers are content-type and content-length which inform the receiver how the content should be interpreted, and how long it should be. In Flask headers are contained in the request.headers attribute that works like a dictionary.
Host part is the part of URL that indicates the server's address. For example, lovelace.oulu.fi is the host part. This part determines where (i.e. which IP address) in the world wide web the request is sent.
In API terminology hypermedia means additional information that is added on top of raw data in resource representations. It's derived from hypertext - the stuff that makes the world wide web tick. The purpose of the added hypermedia is to inform the client about actions that are available in relation to the resource they requested. When this information is conveyed in the representations sent by the API, the client doesn't need to know how to perform these actions beforehand - it only needs to parse them from the response.
An idempotent operation is an operation that, if applied multiple times with the same parameters, always has the same result regardless of how many times it's applied. If used properly, PUT is an idempotent operation: no matter how many times you replace the contents of a resource it will have the same contents as it would have if only one request had been made. On the other hand POST is usually not idempotent because it attempts to create a new resource with every request.
The info object in OpenAPI gives basic information about your API. This basic information includes general description, API version number, and contact information. Even more importantly, it includes license information and link to your terms of service.
Instance folder is a Flask feature. It is intended for storing files that are needed when running the Flask application, but should not be in the project's code repository. The primary example of this is the production configuration file, which differs from installation to installation and generally should remain unchanged when the application code is updated from the repository. The instance path can be found from the application context: app.instance_path. Flask has a reasonable default for it, but it can also be set manually when calling the Flask constructor by adding the instance_path keyword argument. The path should be written as absolute in this case.
JavaScript Object Notation (JSON) is a popular document format in web development. It's a serialized representation of a data structure. Although the representation syntax originates from JavaScript, it's almost identical to Python dictionaries and lists in formatting and structure. A JSON document consists of key-value pairs (similar to Python dictionaries) and arrays (similar to Python lists). It's often used in APIs, and also in Ajax calls on web sites.
JSON schema is a JSON document that defines the validity criteria for JSON documents that fall under the schema. It defines the type of the root object, and types as well as additional constraints for attributes, and which attributes are required. JSON schemas serve two purposes in this course: clients can use them to generate requests to create/modify resources, and they can also be used on the API end to validate incoming requests.
MIME type is a standard used for indicating the type of a document. In a web development context it is placed in the Content-Type header. Browsers and servers use the MIME type to determine how to process the request/response content. On this course the MIME type is in most cases application/json.
Microservice is a web architecture concept where a system consists of multiple smaller components called microservices, each exposing an API for communication with other components. Typically one microservice is responsible for just one feature of the system. The main advantage is that each microservice can be developed independently of the others, even with entirely different programming languages and frameworks, and one team can be responsible for one microservice. This reduces side effects from changes, which often occur in monolithic systems where the entire system is one application with highly coupled components. Microservices also allow the system to be scaled up in a more granular way and to be more fault tolerant. Of course nothing comes without a tradeoff; for microservices it is the increased need for, and complexity of, communication and orchestration between the services.
Database migration is a process where an existing database is updated with a new database schema. This is done in a way that does not lose data. Some changes can be migrated automatically. These include creation of new tables, removal of columns and adding nullable columns. Other changes often require a migration script that does the change in multiple steps so that old data can be transformed to fit the new schema. E.g. adding a non-nullable column usually involves adding it first as nullable, then using a piece of code to determine values for each row, and finally setting the column to non-nullable.
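As a rough sketch of the multi-step example above, written with Alembic-style migration operations (the table and column names are hypothetical, and the course material does not mandate a specific migration tool):

import sqlalchemy as sa
from alembic import op

def upgrade():
    # 1. add the new column as nullable so that existing rows stay valid
    op.add_column("sensor", sa.Column("model", sa.String(64), nullable=True))
    # 2. determine a value for each existing row
    op.execute("UPDATE sensor SET model = 'unknown'")
    # 3. only then tighten the column to non-nullable
    op.alter_column("sensor", "model", existing_type=sa.String(64), nullable=False)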
  1. Description
  2. Example
In ORM terminology, a model class is a program level class that represents a database table. Instances of the class represent rows in the table. Creation and modification operations are performed using the class and instances. Model classes typically share a common parent (e.g. db.Model) and table columns are defined as class attributes with special constructors (e.g. db.Column).
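A minimal sketch with Flask-SQLAlchemy (the column names loosely follow the SensorHub example but are not its exact definitions):

from flask import Flask
from flask_sqlalchemy import SQLAlchemy

app = Flask(__name__)
app.config["SQLALCHEMY_DATABASE_URI"] = "sqlite:///test.db"
db = SQLAlchemy(app)

class Sensor(db.Model):
    # table columns are defined as class attributes with db.Column
    id = db.Column(db.Integer, primary_key=True)
    name = db.Column(db.String(32), nullable=False, unique=True)
    model = db.Column(db.String(64), nullable=False)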
  1. Description
  2. Example
In API terminology, a namespace is a prefix for names used by the API that makes them unique. The namespace should be a URI, but it doesn't have to be a real address. However, it is usually convenient to place a document that describes the names within the namespace at the namespace URI. For our purposes, the namespace contains the custom link relations used by the API.
Object relational mapping is a way of abstracting database use. Database tables are mapped to programming language classes. These are usually called models. A model class declaration defines the table's structure. When rows from the database table are fetched, they are represented as instances of the model class with columns as attributes. Likewise new rows are created by making new instances of the model class and committing them to the database. This course uses SQLAlchemy's ORM engine.
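Continuing the model class sketch above, rows are created and fetched through the model class rather than with SQL:

# create a row: instantiate the model and commit the session
db.session.add(Sensor(name="test-sensor-1", model="testsensor"))
db.session.commit()

# fetch rows: queries return model instances with columns as attributes
sensor = Sensor.query.filter_by(name="test-sensor-1").first()
print(sensor.model)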
OpenAPI (previously: Swagger) is a description language for API documentation. It can be written in either JSON or YAML. An OpenAPI document is a single nested data structure, which makes it suitable to be used with various tools. For example, Swagger UI is a basic tool that renders an OpenAPI description into a browsable documentation page. Other tools can, for instance, use the schemas in an OpenAPI description for validation, or generate an OpenAPI specification from live code.
  1. Description
  2. Example
Operation object is one of the main parts of an OpenAPI specification. It describes one operation on a resource (e.g. GET). The operation object includes full details of how to perform the operation, and what kinds of responses can be expected from it. Two of its key fields are requestBody, which shows how to make the request, and responses, which is a mapping of potential responses.
With Flasgger, an operation object can be put into a view method's docstring, or a separate file, to document that particular view method.
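A minimal sketch of documenting a GET operation in a view function's docstring with Flasgger (the route and response description are illustrative):

from flask import Flask, jsonify
from flasgger import Swagger

app = Flask(__name__)
swagger = Swagger(app)

@app.route("/api/sensors/")
def get_sensors():
    """
    List all sensors
    ---
    description: Returns a list of all sensors known to the API
    responses:
      200:
        description: List of sensors retrieved successfully
    """
    return jsonify([{"name": "test-sensor-1"}])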
Pagination divides a larger dataset into smaller subsets called pages. Search engine results are the most common example: you usually get the first 10 or 20 hits from your search, and then have to request the next page in order to get more. The purpose of pagination is to avoid transferring (and rendering) unnecessary data, and it is particularly useful in scenarios where the relevance of data declines rapidly (like search results, where accuracy drops the further you go). An API that offers paginated data will typically offer access to specific pages with both absolute (i.e. page number) and relative (e.g. "next", "prev", "first" etc.) URLs. These are usually implemented through query parameters.
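A very rough sketch of offset-based pagination in Flask; query_measurements is a hypothetical helper that returns the full result list:

from flask import Flask, jsonify, request

app = Flask(__name__)
PAGE_SIZE = 20

@app.route("/api/measurements/")
def get_measurements():
    # absolute access to a page, e.g. /api/measurements/?page=2
    page = request.args.get("page", default=0, type=int)
    items = query_measurements()  # hypothetical helper
    start = page * PAGE_SIZE
    return jsonify({
        "items": items[start:start + PAGE_SIZE],
        # relative access to the following page
        "next": f"/api/measurements/?page={page + 1}",
    })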
In OpenAPI a path parameter is a variable placeholder in a path. It is the OpenAPI equivalent of the URL parameters that we use in routing. A path parameter typically has a description and a schema that defines what is considered a valid value for it. These parameter definitions are often placed into the components object as they will be used in multiple resources. In OpenAPI syntax, path parameters in paths are marked with curly braces, e.g. /api/sensors/{sensor}/.
In database terminology, primary key refers to the column in a table that is intended to be the primary way of identifying rows. Each table must have exactly one, and it needs to be unique. This is usually some kind of unique identifier associated with the objects represented by the table, or, if such an identifier doesn't exist, simply a running ID number (which is incremented automatically).
Profile is metadata about a resource. It's a document intended for client developers. A profile gives meaning to each word used in the resource representation, be it a link relation or a data attribute (also known as semantic descriptors). With the help of profiles, client developers can teach machine clients to understand resource representations sent by the API. Note that profiles are not part of the API and are usually served as static HTML documents. Resource representations should always contain a link to their profile.
In database terminology, a query is a command sent to the database that can fetch or alter data in the database. Queries are written in a script-like language; the most common is the structured query language (SQL). In object relational mapping, queries are abstracted behind Python method calls.
  1. Description
  2. Example
Query parameters are additional parameters that are included in a URL. You can often see these in web searches. They are the primary mechanism of passing arbitrary parameters with an HTTP request. They are separated from the actual address by ?. Each parameter is written as a key=value pair, and they are separated from each other by &. In Flask applications they can be found in request.args, which works like a dictionary.
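A small sketch of reading query parameters in Flask, e.g. for a request to /api/sensors/?search=donkey&limit=10 (the route and parameter names are made up):

from flask import Flask, request

app = Flask(__name__)

@app.route("/api/sensors/")
def sensor_search():
    search = request.args.get("search", default="")
    limit = request.args.get("limit", default=20, type=int)
    return f"searching for {search!r}, returning at most {limit} results"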
  1. Description
  2. Examples
Regular expressions are used in computing to define matching patterns for strings. In this course they are primarily used in validation of route variables, and in JSON schemas. Typical features of regular expressions are that they look like a string of garbage letters and get easily out of hand if you need to match something complex. They are also widely used in Lovelace text field exercises to match correct (and incorrect) answers.
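A small sketch with Python's re module, using a hypothetical pattern for sensor names:

import re

# lowercase letters and digits separated by single dashes, e.g. "test-sensor-1"
pattern = re.compile(r"^[a-z0-9]+(-[a-z0-9]+)*$")

print(bool(pattern.match("test-sensor-1")))   # True
print(bool(pattern.match("Test Sensor 1")))   # False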
In this course request refers to an HTTP request: a request sent by a client to an HTTP server. It consists of the requested URL, which identifies the resource the client wants to access, and a method describing what it wants to do with the resource. Requests also include headers, which provide further context information, and possibly a request body that can contain e.g. a file to upload.
  1. Description
  2. Accessing
In an HTTP request, the request body is the actual content of the request. For example, when uploading a file, the file's contents would be contained within the request body. When working with APIs, the request body usually contains a JSON document. The request body is mostly used with POST, PUT and PATCH requests.
  1. Description
  2. Getting data
The request object is a concept in web development frameworks. It's a programming language object representation of the HTTP request made to the server. It has attributes that contain all the information contained within the request, e.g. method, URL, headers and request body. In Flask the object can be imported from the flask package to make it globally available.
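A minimal sketch of using Flask's globally importable request object inside a view function (the route is made up):

from flask import Flask, request

app = Flask(__name__)

@app.route("/api/sensors/", methods=["POST"])
def add_sensor():
    print(request.method)                        # "POST"
    print(request.url)                           # full request URL
    print(request.headers.get("Content-Type"))   # a request header
    data = request.get_json()                    # parsed JSON request body
    return "", 201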
In RESTful API terminology, a resource is anything that is interesting enough that a client might want to access it. A resource is a representation of data that is stored in the API. While resources usually represent data from database tables, it is important to understand that they do not have a one-to-one mapping to database tables. A resource can combine data from multiple tables, and there can be multiple representations of a single table. Also, things like searches are seen as resources (a search does, after all, return a filtered representation of data).
Resource classes are introduced in Flask-RESTful for implementing resources. They inherit from flask_restful.Resource. A resource class has a view-like method for each HTTP method supported by the resource (method names are written in lowercase). Resources are routed through api.add_resource, which routes all of the methods to the same URI (in accordance with REST principles). As a consequence, all methods must also have the same parameters.
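A minimal sketch of a Flask-RESTful resource class; the SensorItem name and URI are illustrative, not the SensorHub implementation:

from flask import Flask
from flask_restful import Api, Resource

app = Flask(__name__)
api = Api(app)

class SensorItem(Resource):
    # one lowercase method per supported HTTP method, all with the same parameters
    def get(self, sensor):
        return {"name": sensor}

    def put(self, sensor):
        return "", 204

# all of the methods are routed through the same URI
api.add_resource(SensorItem, "/api/sensors/<sensor>/")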
In this course we use the term representation to emphasize that a resource is, in fact, a representation of something stored in the API server. In particular you can consider representation to mean the response sent by the API when it receives a GET request. This representation contains not only data but also hypermedia controls which describe the actions available to the client.
In this course response refers to an HTTP response, the response given by an HTTP server when a request is made to it. Responses consist of a status code, headers and (optionally) a response body. The status code describes the result of the transaction (success, error, something else). Headers provide context information, and the response body contains the document (e.g. an HTML document) returned by the server.
The response body is the part of an HTTP response that contains the actual data sent by the server. The body will be either text or binary, and this information, together with more specific type instructions (e.g. JSON), is given in the response's Content-Type header. Only GET requests are expected to return a response body on a successful request.
The response object is the client-side counterpart of the request object. It is mainly used in testing: the Flask test client returns a response object when it makes a "request" to the server. The response object has various attributes that represent different parts of an actual HTTP response. The most important are usually status_code and data.
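A small sketch from a test module; the import and route are hypothetical and should be adjusted to the application under test:

import json

from myapp import app  # hypothetical import; use your own application module

def test_get_sensor():
    client = app.test_client()
    resp = client.get("/api/sensors/test-sensor-1/")
    assert resp.status_code == 200
    body = json.loads(resp.data)
    assert body["name"] == "test-sensor-1"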
In database terminology, rollback is the cancellation of a database transaction by returning the database to a previous (stable) state. Rollbacks are generally needed if a transaction puts the database in an error state. On this course rollbacks are generally used in testing after deliberately causing errors.
  1. Description
  2. Routing in Flask
  3. Reverse routing
  4. Flask-RESTful routing
URL routing in web frameworks is the process in which the framework transforms the URL from an HTTP request into a Python function call. When routing, a URL is matched against a sequence of URL templates defined by the web application. The request is routed to the function registered for the first matching URL template. Any variables defined in the template are passed to the function as parameters.
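A minimal Flask sketch: the URL template registers the view function, and the variable in the template becomes a function parameter (route and names are illustrative):

from flask import Flask

app = Flask(__name__)

@app.route("/api/sensors/<sensor>/")
def get_sensor(sensor):
    # a request to /api/sensors/test-sensor-1/ is routed here
    # with sensor="test-sensor-1"
    return f"you requested {sensor}"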
In relational database terminology, a row refers to a single member of a table, i.e. one object with properties that are defined by the table's columns. Rows must be uniquely identifiable by at least one column (the table's primary key).
SQL (structured query language) is a family of languages that are used for interacting with databases. Queries typically involve selecting a range of data from one or more tables, and defining an operation to perform on it (such as retrieving the contents).
Serialization is a common term in computer science. It's a process through which data structures from a program are turned into a format that can be saved on the hard drive or sent over the network. Serialization is a reversible process - it should be possible to restore the data structure from the representation. A very common serialization method in web development is JSON.
In web applications, static content refers to content that is served from static files on the web server's hard drive (or, in bigger installations, from a separate media server). This includes images as well as JavaScript files. HTML files that are not generated from templates are also static content.
Swagger is a set of tools for making API documentation easier. In this course we use it primarily to render easily browsable online documentation from OpenAPI description source files. Swagger open source tools also allow you to run mockup servers from your API description, and there is a Swagger editor where you can easily see the results of changes to your OpenAPI description in the live preview.
In this course we use Flasgger, a Swagger extension for Flask, to render API documentation.
  1. Description
  2. Creating
System user is an operating system concept, particularly in UNIX systems, for users that exist for the sole purpose of processes taking their identity when run. Unlike normal users, they do not have a password and cannot be logged in as. Their primary purpose is to manage permissions so that each process only has access to resources it actually needs for operation, thus reducing the amount of information that an attacker can access if they are able to take control of the process.
In database terminology, a table is a collection of similar items. The attributes of those items are defined by the table's columns that are declared when the table is created. Each item in a table is contained in a row.
In software testing, test setup is a procedure that is undertaken before each test case. It prepares preconditions for the test. On this course this is done with pytest's fixtures.
In software testing, test teardown is a process that is undertaken after each test case. Generally this involves clearing up the database (e.g. dropping all tables) and closing file descriptors, socket connections etc. On this course pytest fixtures are used for this purpose.
Uniform resource identifier (URI) is basically what the name says: it's a string that unambiguously identifies a resource, thereby making it addressable. In APIs everything that is interesting enough is given its own URI. URLs are URIs that specify the exact location where to find the resource, which means including the protocol (http) and the server part (e.g. lovelace.oulu.fi) in addition to the part that identifies the resource within the server (e.g. /ohjelmoitava-web/programmable-web-project-spring-2019).
  1. Description
  2. Type converters
  3. Custom converters
A URL template defines a range of possible URLs that all lead to the same view function by defining variables. While it's possible for these variables to take arbitrary values, they are more commonly used to select one object from a group of similar objects, e.g. one user's profile from all the user profiles in the web service (in Flask: /profile/<username>). If a matching object doesn't exist, the default response would be 404 Not Found. When using a web framework, variables in the URL template are usually passed to the corresponding view function as arguments.
Uniform interface is a REST principle which states that all HTTP methods, which are the verbs of the API, should always behave in the same standardized way. In summary:
  • GET - should return a representation of the resource; does not modify anything
  • POST - should create a new instance that belongs to the target collection
  • PUT - should replace the target resource with a new representation (usually only if it exists)
  • DELETE - should delete the target resource
  • PATCH - should describe a change to the resource
In database terminology, a unique constraint is what ensures the uniqueness of each row in a table. A primary key automatically creates a unique constraint, as do unique columns. A unique constraint can also be a combination of columns, so that each combination of values between these columns is unique. For example, page numbers by themselves are hardly unique as each book has a first page, but the combination of book and page number is unique - you can only have one first page in a book.
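A sketch of the book page example with Flask-SQLAlchemy, assuming db is the SQLAlchemy object from the earlier model class sketch and that a book table exists:

class Page(db.Model):
    id = db.Column(db.Integer, primary_key=True)   # implies a unique constraint
    book_id = db.Column(db.Integer, db.ForeignKey("book.id"))
    number = db.Column(db.Integer, nullable=False)

    # combined unique constraint: the same page number may appear in many
    # books, but only once within one book
    __table_args__ = (
        db.UniqueConstraint("book_id", "number", name="uq_page_book_number"),
    )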
  1. Description
  2. Registering
View functions are Python functions (or methods) that are used for serving HTTP requests. In web applications that often means rendering a view (i.e. a web page). View functions are invoked from URLs by routing. A view function always has application context.
  1. Description
  2. Creation
  3. Activation
A Python virtual environment (virtualenv, venv) is a system for managing packages separately from the operating system's main Python installation. It helps project dependency management in multiple ways. First of all, you can install specific versions of packages per project. Second, you can easily get a list of requirements for your project without any extra packages. Third, virtual environments can be placed in directories owned by non-admin users, so that those users can install the packages they need without admin privileges. The venv module, which is in charge of creating virtual environments, comes with newer versions of Python.
Web Server Gateway Interface, WSGI (pronounced whiskey because no one wants to read that abbreviation aloud) is a Python specification that defines how web servers can communicate with Python applications so that an HTTP request gets converted into a Python function call, and the return value of the call is converted into an HTTP response. Its main purpose is to make it easy to create web applications with Python and have them work uniformly. It also has an asynchronous cousin in ASGI if you feel like your application will benefit from using an asynchronous web framework (like FastAPI).
A web API is an interface, implemented using web technologies, that exposes functionality in a remote machine (server). By extension, web API also refers to the exposed functionality itself.
Web server is an application that listens to HTTP and HTTPS traffic, and defines how it is responded to. Typical behaviors include serving a static file, routing the request to a web application, or routing it to another web server. Web servers can also function as load balancers that distribute traffic to multiple server nodes in order to serve a larger number of clients simultaneously. When deploying web applications behind web servers, the web server typically takes care of handling encryption, provided the application runs on the same machine as the web server. Presently, the most common web servers are Apache and NGINX.
  1. Description
  2. Example
YAML (YAML Ain't Markup Language) is a human-readable data serialization language that uses a similar object based notation as JSON but removes a lot of the "clutter" that makes JSON hard to read. Like Python, YAML uses indentation to distinguish blocks from each other, although it also supports using braces for this purpose (which, curiously enough, makes JSON valid YAML). It also removes the use of quotation characters where possible. It is one of the options for writing OpenAPI descriptions, and the one we are using on this course.