REST API Design Example

This page walks through another design example for a REST API. The same API is discussed further in the last two exercises, where we add documentation and hypermedia to it and finally build a machine client for it.

Please note that this material has been written with the expectation that the reader is familiar with RESTful API concepts from the lecture materials or relevant course book chapters.

API Concept

The API concept in this material is a service that stores metadata about music. The metadata is split into three levels: artist, album and track. The API can be used to enrich music-related data from other sources, and can also be used to fill in partial metadata of a poorly managed music collection. This example is interesting because the problem domain has some peculiar characteristics that require additional design considerations. It's also a good API example because its primary clients are machines.

The data schema is not particularly large: artists are authors of albums which contain tracks. So we have a clear hierarchy that is easy enough to represent in a database.

Challenges

The first challenge related to the problem domain is that names of artists, albums or tracks are generally not trademarkable. In other words, they are not unique. You can find multiple artists - sometimes even from the same country - with the exact same name. The same goes for album names. On the other hand, artists typically don't have multiple albums with the same name. Usually albums don't have multiple tracks with the same name either, but there's an exception: there can be multiple untitled tracks on an album. One way or another our API needs to navigate this non-uniqueness mess.

The second challenge is the existence of "various artists" releases (VA for short), i.e. collaborative works. These are albums that have multiple artists, with one or more tracks from each. With these releases each track has to have its artist defined separately unlike normal releases where all tracks on an album are by the same artist. So although these two types of releases are very similar, they are not identical and will require some degree of differing treatment.

Related Services

This example API provides data similar to that of two free services for music metadata: Musicbrainz and FreeDB. These services are often used when ripping audio files from CDs because they have a CD checksum lookup for metadata. Of course our example is more limited, but unlike these two it is a RESTful API. There's also Rate Your Music, which offers meta information for human users.

An example of a data source that can be used with this API is last.fm which is a tracking site for your personal music listening. It has a lot of metadata of its own but one tragic failing: it is unable to track listening time accurately as the primary statistics are listening counts per track. This means that the statistics are biased towards artists that have shorter average track length. Not to worry, last.fm has its own API. It would be possible to pull data from there and combine with the length metadata from our proposed API!
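As a rough sketch of that idea, the snippet below weights play counts by track length. All numbers are made-up illustration data and the function name `listening_time` is hypothetical; in practice the play counts would come from last.fm and the lengths from our API.

```python
from datetime import timedelta


def listening_time(play_counts, track_lengths):
    """Sum play count * track length over every track with a known length."""
    total = timedelta()
    for track, count in play_counts.items():
        length = track_lengths.get(track)
        if length is not None:
            total += count * length
    return total


# Made-up lengths: one short track and one long track.
track_lengths = {
    "short track": timedelta(minutes=2),
    "long track": timedelta(minutes=6),
}

# 30 plays of the short track take as long as 10 plays of the long one,
# even though the play counts differ by a factor of three.
short_total = listening_time({"short track": 30}, track_lengths)
long_total = listening_time({"long track": 10}, track_lengths)
```

Ranking by `short_total` and `long_total` instead of raw play counts removes the bias towards shorter tracks.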

Database Design

From our concept we can easily come up with a database that has three models: album, artist and track. However we also need to consider the VA exception when designing these models, so we actually have two additional item types to represent: VA album and VA track. We also need to figure out unique constraints for each model. Although everything in the database has a unique primary key, you should never use raw database IDs to address resources in an API. First of all they don't mean anything. Second, it introduces vulnerabilities for APIs that don't want unauthorized clients to infer details about the content.

A unique constraint lets us define more complex notions of uniqueness than just marking individual columns as unique. A combination of multiple columns can be made into a unique constraint so that a certain combination of values in these columns can only appear once. For example, we can probably assume that the same artist is not going to have multiple albums with the same name (we're not counting separate editions). So while album title by itself cannot be unique, album title combined with the artist ID foreign key can.

class Album(db.Model):
    __table_args__ = (db.UniqueConstraint("title", "artist_id", name="_artist_title_uc"), )

Please note the comma at the end: it tells Python that this is a one-item tuple, not a single value in regular parentheses. You can list as many column names for the unique constraint as you want. The name at the end can be anything, so it is best to make it descriptive. For individual tracks we have an even better unique constraint: each album can only have one track at each index on each disc. So the unique constraint for tracks is a combination of album ID, track number and disc number.

class Track(db.Model):
    __table_args__ = (db.UniqueConstraint("disc_number", "track_number", "album_id", name="_track_index_uc"), )
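To see what a multi-column unique constraint does in practice, here is a minimal sketch that uses the standard library's sqlite3 module instead of the Flask-SQLAlchemy models. The table layout is a simplified assumption for illustration, and `add_track` is a hypothetical helper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE track (
        id INTEGER PRIMARY KEY,
        title TEXT NOT NULL,
        disc_number INTEGER NOT NULL DEFAULT 1,
        track_number INTEGER NOT NULL,
        album_id INTEGER NOT NULL,
        CONSTRAINT _track_index_uc UNIQUE (disc_number, track_number, album_id)
    )
    """
)


def add_track(title, disc, number, album_id):
    """Insert a track; return False if the unique constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO track (title, disc_number, track_number, album_id) "
                "VALUES (?, ?, ?, ?)",
                (title, disc, number, album_id),
            )
    except sqlite3.IntegrityError:
        return False
    return True


first = add_track("Untitled", 1, 1, 1)       # new index: accepted
same_title = add_track("Untitled", 1, 2, 1)  # same title, different index: accepted
duplicate = add_track("Other", 1, 1, 1)      # same disc, track and album: rejected
```

Note that two untitled tracks on the same album are fine here, because the title is not part of the constraint.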

We're going to solve the VA problem by allowing an album's artist foreign key to be null, and by adding an optional va_artist field to tracks. We'll make this mandatory for VA tracks on the application logic side later. Overall our database code ends up looking like this:

models.py

from flask import Flask
from flask_sqlalchemy import SQLAlchemy
from sqlalchemy.engine import Engine
from sqlalchemy import event
from sqlalchemy.exc import IntegrityError, OperationalError

app = Flask(__name__, static_folder="static")
app.config["SQLALCHEMY_DATABASE_URI"] = "sqlite:///development.db"
app.config["SQLALCHEMY_TRACK_MODIFICATIONS"] = False
db = SQLAlchemy(app)

@event.listens_for(Engine, "connect")
def set_sqlite_pragma(dbapi_connection, connection_record):
    cursor = dbapi_connection.cursor()
    cursor.execute("PRAGMA foreign_keys=ON")
    cursor.close()

va_artist_table = db.Table(
    "va_artists", 
    db.Column("album_id", db.Integer, db.ForeignKey("album.id"), primary_key=True),
    db.Column("artist_id", db.Integer, db.ForeignKey("artist.id"), primary_key=True)
)


class Track(db.Model):
    
    __table_args__ = (db.UniqueConstraint(
        "disc_number",
        "track_number",
        "album_id",
        name="_track_index_uc"),
    )
    
    id = db.Column(db.Integer, primary_key=True)
    title = db.Column(db.String, nullable=False)
    disc_number = db.Column(db.Integer, default=1)
    track_number = db.Column(db.Integer, nullable=False)
    length = db.Column(db.Time, nullable=False)
    album_id = db.Column(
        db.ForeignKey("album.id", ondelete="CASCADE"),
        nullable=False
    )
    va_artist_id = db.Column(
        db.ForeignKey("artist.id", ondelete="SET NULL"),
        nullable=True
    )
    
    album = db.relationship("Album", back_populates="tracks")
    va_artist = db.relationship("Artist")

    def __repr__(self):
        return "{} <{}> on {}".format(self.title, self.id, self.album.title)
    
    
class Album(db.Model):
    
    __table_args__ = (db.UniqueConstraint(
        "title",
        "artist_id",
        name="_artist_title_uc"),
    )
    
    id = db.Column(db.Integer, primary_key=True)
    title = db.Column(db.String, nullable=False)
    release = db.Column(db.Date, nullable=False)
    artist_id = db.Column(
        db.ForeignKey("artist.id", ondelete="CASCADE"),
        nullable=True
    )
    genre = db.Column(db.String, nullable=True)
    discs = db.Column(db.Integer, default=1)
    
    artist = db.relationship("Artist", back_populates="albums")
    va_artists = db.relationship("Artist", secondary=va_artist_table, back_populates="va_albums")
    tracks = db.relationship("Track",
        cascade="all,delete",
        back_populates="album",
        order_by=(Track.disc_number, Track.track_number)
    )
    
    sortfields = ["artist", "release", "title"]
    
    def __repr__(self):
        return "{} <{}>".format(self.title, self.id)


class Artist(db.Model):
    
    id = db.Column(db.Integer, primary_key=True)
    name = db.Column(db.String, nullable=False)
    unique_name = db.Column(db.String, nullable=False, unique=True)
    formed = db.Column(db.Date, nullable=True)
    disbanded = db.Column(db.Date, nullable=True)
    location = db.Column(db.String, nullable=False)
    
    albums = db.relationship("Album", cascade="all,delete", back_populates="artist")
    va_albums = db.relationship("Album",
        secondary=va_artist_table,
        back_populates="va_artists",
        order_by=Album.release
    )

    def __repr__(self):
        return "{} <{}>".format(self.name, self.id)

This code also shows how to set default ordering for relationships, along with a couple of other things that weren't covered in Exercise 1.
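One detail worth keeping in mind with a nullable artist foreign key: SQLite treats NULL values as distinct from each other in unique constraints, so a constraint on title and artist ID does not stop two VA albums from sharing a title. The sketch below shows this with the standard library's sqlite3 module. The simplified table and the helpers `add_album` and `va_title_taken` are assumptions for illustration; if VA album titles need to be unique, one option is a check like this on the application logic side.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE album (
        id INTEGER PRIMARY KEY,
        title TEXT NOT NULL,
        artist_id INTEGER,
        CONSTRAINT _artist_title_uc UNIQUE (title, artist_id)
    )
    """
)


def add_album(title, artist_id):
    """Insert an album; return False if the unique constraint rejects it."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO album (title, artist_id) VALUES (?, ?)",
                (title, artist_id),
            )
    except sqlite3.IntegrityError:
        return False
    return True


def va_title_taken(title):
    """Application-side check: is there already a VA album with this title?"""
    row = conn.execute(
        "SELECT 1 FROM album WHERE title = ? AND artist_id IS NULL",
        (title,),
    ).fetchone()
    return row is not None


normal_first = add_album("Hello World", 1)
normal_dup = add_album("Hello World", 1)   # rejected by the constraint
va_first = add_album("Summer Hits", None)
va_dup = add_album("Summer Hits", None)    # accepted: NULL artist IDs never compare equal
```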

Resource Design

We're now ready to design the resources provided by our API. An important takeaway from this section is how we turn three database models into (way) more than three resources. We will also explain how HTTP methods are used in this API by following REST principles.

Resources from Models

A resource should be something that is interesting enough to be given its own URI in our service. Likewise each resource must be uniquely identifiable by its URI. It's quite common for an API to have at least twice as many resources as it has database tables. This follows from a simple reasoning: for each table, a client might be interested in the table as a collection, or just in an individual row in the table. Even if the collection representation has all the stored data about each of its items, the item representation must also exist if we want to enable clients to manipulate them.

If we were to follow this very simple reasoning, we'd have 6 resources:

artist collection
artist item
album collection
album item
track collection
track item

It is worth noting that a collection type resource doesn't necessarily have to contain the entire contents of the associated table. For instance, a contextless track collection makes very little sense; a collection of tracks by album makes more sense. In fact an album is a collection of tracks, so having a separate track collection resource might not even make sense. The artist collection is simple enough: artist is at the top of the hierarchy, so it makes sense for the collection to have all artists. What about albums though? As with tracks, it does make sense to have "albums by an artist" as a collection resource. But we also have VA albums to worry about. We can make two collection resources: one for an artist's albums and another for VA albums. We end up with:

artist collection
artist item
albums by artist collection
VA albums collection
album item (incorporates track collection)
track item

However, we have slightly different representations for VA albums compared to normal albums, and the same goes for tracks. Even though we chose the same model to represent both, they do have ever so slightly different requirements to be valid: for normal albums, we must know the artist; for VA album tracks we must know the track artist. So it would be fair to say that these are in fact separate representations that should be added as resources. Finally let's add a collection of all albums so that clients can see what albums our API has data for.

artist collection
artist item
all albums collection
albums by artist collection
VA albums collection
album item (incorporates track collection)
VA album item (incorporates VA track collection)
track item
VA track item

Bonus consideration: Why are we incorporating track collection into album, but not incorporating album collection into artist? Mostly because artist as a concept is more than a collection of albums. For example, an artist could also be a collection of people (band members). The API should state what it means explicitly, and therefore it is better to separate "artist" from "albums by artist".

Routing Resources

After identifying what's considered important enough (and different enough) to be regarded as its own resource, we now have to come up with URIs so that each can be uniquely identified (addressability principle). This also defines our URI hierarchy. We want the URIs to convey the relationships between our resources. For normal albums the hierarchy goes like this:

artist collection
└── artist
    └── album collection
        └── album
            └── track

We decided that album title paired with artist ID is sufficient for uniqueness. We also decided that the best way to uniquely identify a track is to use its position on the album as an index consisting of disc and track numbers. Taking all this into account, we end up with a route that looks like this:

/api/artists/{artist_unique_name}/albums/{album_title}/{disc}/{track}/

This uniquely identifies each track, and also clearly shows the hierarchy. All the intermediate resources (both collections and items) can be found by dropping off parts from the end. We will separate VA albums from the rest by using VA to replace {artist}, ending up with this route to identify each VA track:

/api/artists/VA/albums/{album_title}/{disc}/{track}/

Then we need to add one more separate branch to the URI tree for the collection that shows all albums:

/api/albums/

The entire URI tree becomes:

api
├── artists
│   ├── {artist}
│   │   └── albums
│   │       └── {album}
│   │           └── {disc}
│   │               └── {track}
│   └── VA
│       └── albums
│           └── {album}
│               └── {disc}
│                   └── {track}
└── albums
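The URI tree can be sketched as a few small helper functions that build each level from the one above it. The function names are hypothetical, the album title "AC/DC Tribute" is made up, and percent-encoding artist and album names with `urllib.parse.quote` is an assumption about how names containing spaces or slashes would be placed into a URI.

```python
from urllib.parse import quote


def artist_uri(artist):
    """URI of an artist item; pass "VA" for the various artists branch."""
    return "/api/artists/{}/".format(quote(artist, safe=""))


def album_uri(artist, album):
    """URI of an album item under its artist's album collection."""
    return "{}albums/{}/".format(artist_uri(artist), quote(album, safe=""))


def track_uri(artist, album, disc, track):
    """URI of a track item, addressed by disc and track number."""
    return "{}{}/{}/".format(album_uri(artist, album), int(disc), int(track))


uri = track_uri("scandal", "Hello World", 1, 3)
va_uri = track_uri("VA", "AC/DC Tribute", 2, 10)
```

Because each function extends the result of its parent, dropping parts from the end of a track URI always yields the URI of an existing intermediate resource.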

Resource Actions

Following REST principles our API should offer actions as HTTP methods targeted at resources. To reiterate, each HTTP method should be used as follows:

GET - should return a representation of the resource; does not modify anything
POST - should create a new instance that belongs to the target collection
PUT - should replace the target resource with a new representation (only if it exists)
DELETE - should delete the target resource
PATCH - should describe a change to the resource - generally not recommended, see extra chapter

Most resources should therefore implement GET. Collection types usually implement POST whereas PUT and DELETE are typically attached to individual items. In our case we make two exceptions: first, as album serves as both an item and as a collection, it actually implements all four; second, the albums resource at the bottom of the URI tree above should not provide POST because there is no way of knowing from the URI which artist is the author. The parent of a new item should always be found from the URI - not in the request body. We're not using PATCH in this example.

Gathering everything into a table:

Resource	URI	GET	POST	PUT	DELETE
artist collection	/api/artists/	X	X	-	-
artist item	/api/artists/{artist}/	X	-	X	X
albums by artist	/api/artists/{artist}/albums/	X	X	-	-
albums by VA	/api/artists/VA/albums/	X	X	-	-
all albums	/api/albums/	X	-	-	-
album	/api/artists/{artist}/albums/{album}/	X	X	X	X
VA album	/api/artists/VA/albums/{album}/	X	X	X	X
track	/api/artists/{artist}/albums/{album}/{disc}/{track}/	X	-	X	X
VA track	/api/artists/VA/albums/{album}/{disc}/{track}/	X	-	X	X

Since we are following the uniform interface REST principle and each HTTP method does what it's expected to, this table actually tells a lot about our API: it shows every possible HTTP request that can be made and even hints at their meaning: if you send a PUT request to a track resource, it will modify the track's data (even more specifically it will replace all data with what's in the request body). It just doesn't do a very good job of explaining what requests and responses should look like.
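The same table can also be written down as data. The sketch below is one possible way to do that, not part of the API implementation: the names `RESOURCE_METHODS`, `is_allowed` and `allow_header` are hypothetical. A server could use such a lookup to answer an unsupported method with 405 Method Not Allowed and an Allow header.

```python
# Route templates mapped to the HTTP methods marked with X in the table.
RESOURCE_METHODS = {
    "/api/artists/": {"GET", "POST"},
    "/api/artists/{artist}/": {"GET", "PUT", "DELETE"},
    "/api/artists/{artist}/albums/": {"GET", "POST"},
    "/api/artists/VA/albums/": {"GET", "POST"},
    "/api/albums/": {"GET"},
    "/api/artists/{artist}/albums/{album}/": {"GET", "POST", "PUT", "DELETE"},
    "/api/artists/VA/albums/{album}/": {"GET", "POST", "PUT", "DELETE"},
    "/api/artists/{artist}/albums/{album}/{disc}/{track}/": {"GET", "PUT", "DELETE"},
    "/api/artists/VA/albums/{album}/{disc}/{track}/": {"GET", "PUT", "DELETE"},
}


def is_allowed(route, method):
    """Return True if the route template supports the given HTTP method."""
    return method.upper() in RESOURCE_METHODS.get(route, set())


def allow_header(route):
    """Build the value of an Allow header for the route template."""
    return ", ".join(sorted(RESOURCE_METHODS.get(route, set())))


can_post_all_albums = is_allowed("/api/albums/", "POST")
track_allow = allow_header("/api/artists/{artist}/albums/{album}/{disc}/{track}/")
```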

Data Representation

Our API communicates in JSON. There isn't a whole lot to data representation: it's a rather straightforward serialization process from model instance attributes to JSON attributes. If the client sends a GET request to, say, /api/artists/scandal/ the data that is returned would be serialized into this:

{
    "name": "Scandal",
    "unique_name": "scandal",
    "location": "Osaka, JP",
    "formed": "2006-08-21",
    "disbanded": null
}

Likewise if the client wants to add a new artist, they'd send an almost identical JSON document, sans unique_name because it is generated by the API server. A similar serialization process can be applied to all models. Collection type resources will have an "items" attribute, which is an array containing the objects that are part of the collection. Most notably albums have both root level data about the album itself, and an array of tracks. It's also worth noting that collection types don't necessarily have to include all the data about their members. For example in album collections we have deemed it sufficient to show album title and artist name:

{
    "items": [
        {
            "artist": "Scandal",
            "title": "Hello World"
        }
    ]
}

If the client wants more information about the album, it can always send a GET to the album resource itself.
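The serialization step could look roughly like the sketch below. It uses a plain dataclass as a stand-in for the database model so that it runs on its own; `ArtistRecord`, `serialize_artist` and `serialize_album_collection` are hypothetical names, and dates are written as ISO 8601 strings to match the example document above.

```python
import json
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class ArtistRecord:
    """Stand-in for the Artist model, holding only the serialized attributes."""
    name: str
    unique_name: str
    location: str
    formed: Optional[date] = None
    disbanded: Optional[date] = None


def serialize_artist(artist):
    """Turn an artist into a JSON-compatible dictionary."""
    return {
        "name": artist.name,
        "unique_name": artist.unique_name,
        "location": artist.location,
        "formed": artist.formed.isoformat() if artist.formed else None,
        "disbanded": artist.disbanded.isoformat() if artist.disbanded else None,
    }


def serialize_album_collection(albums):
    """Turn (artist name, album title) pairs into a collection document."""
    return {"items": [{"artist": artist, "title": title} for artist, title in albums]}


scandal = ArtistRecord("Scandal", "scandal", "Osaka, JP", formed=date(2006, 8, 21))
artist_doc = serialize_artist(scandal)
collection_doc = serialize_album_collection([("Scandal", "Hello World")])
body = json.dumps(artist_doc)  # None becomes null in the JSON text
```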

Conclusion

This short example should have shown you how concepts in the design of an API first become database tables, and ultimately resources that can be offered through the API. In upcoming exercises this API will also be able to give instructions to clients via hypermedia, and we will also give an example of how a machine client can take advantage of it.
