This is a very large refactoring which aims at making Tutor both more
extendable and more generic. Historically, the Tutor plugin system was
designed as an ad-hoc solution to allow developers to modify their own
Open edX platforms without having to fork Tutor. The plugin API was
simple, but limited, because of its ad-hoc nature. As a consequence,
there were many things that plugin developers could not do, such as
extending different parts of the CLI or adding custom template filters.
Here, we refactor the whole codebase to make use of a generic plugin
system. This system was inspired by the Wordpress plugin API and the
Open edX "hooks and filters" API. The various components are added to a
small core thanks to a set of actions and filters. Actions are callback
functions that can be triggered at different points of the application
lifecycle. Filters are functions that modify some data. Both actions and
filters are collectively named as "hooks". Hooks can optionally be
created within a certain context, which makes it easier to keep track of
which application created which callback.
This new hooks system allows us to provide a Python API that developers
can use to extend their applications. The API reference is added to the
documentation, along with a new plugin development tutorial.
The plugin v0 API remains supported for backward compatibility of
existing plugins.
Done:
- Do not load commands from plugins which are not enabled.
- Load enabled plugins once on start.
- Implement contexts for actions and filters, which allow us to keep track of
the source of every hook.
- Migrate patches
- Migrate commands
- Migrate plugin detection
- Migrate templates_root
- Migrate config
- Migrate template environment globals and filters
- Migrate hooks to tasks
- Generate hook documentation
- Generate patch reference documentation
- Add the concept of action priority
Close #499.
Previously, the `k8s exec` command did not support unknown "--options". This
made it impossible to launch, say, a django shell in the lms container.
While implementing this feature we saw an opportunity to simplify the way jobs
are handled in the k8s commands.
Close #636.
Another related issue is: https://github.com/overhangio/2u-tutor-adoption/issues/52
`upgrade` had several issues, which are summarized here:
https://discuss.overhang.io/t/confusing-instructions-during-upgrade/2281/7
- The docs say that you should run quickstart, but what most people will see is
the big command tutor local upgrade --from=lilac verbatim paragraph.
- The local upgrade command should be very explicit about the fact that users
need to run quickstart.
- Maybe the name of the local upgrade command should be improved.
- When upgrading tutor from one major release to the next, there should be a
more explicit warning to inform users of what they are doing (see this other
conversation 1)
- We should tell people that they almost certainly need to enable the tutor and
the mfe plugins, if they are not enabled during upgrade.
- A link to all of the breaking changes from the changelog should be
prominently displayed during upgrade.
- The docs should emphasize that upgrading from one major release to the next
is potentially a risky endeavor and that downgrading is not possible. The docs
should also link to the changelog.
This commit has grown slightly beyond the intended scope, but the changes should be mostly positive.
When upgrading from Lilac, all services break with the following error:
Service "***" is invalid: spec.ports[0].nodePort: Forbidden: may not be used when `type` is 'ClusterIP'
Upgrading deployments fails as well:
Deployment.apps "***" is invalid: spec.selector: Invalid value: v1.LabelSelector{MatchLabels:map[string]string{"app.kubernetes.io/instance":"openedx-********", "app.kubernetes.io/managed-by":"tutor", "app.kubernetes.io/name":"***", "app.kubernetes.io/part-of":"openedx"}, MatchExpressions:[]v1.LabelSelectorRequirement(nil)}: field is immutable
That's because deployments and services need to be deleted as part of the Maple
upgrade. So that's what we do as part of `tutor k8s upgrade --from=lilac`. And
we take the opportunity to:
1. Run upgrade as part of quickstart, when necessary.
2. Default to lilac during `tutor k8s upgrade`.
Close #551.
- A shared cookie domain between lms and cms is no longer recommended:
https://github.com/edx/edx-platform/blob/master/docs/guides/studio_oauth.rst
- refactor: clean mounted data folder in lms/cms. In Lilac, the
bind-mounted lms/data and cms/data folders are a mess because new
folders are created there for every new course organisation. These
folders are empty. As far as we know they are useless... With this
change we move these folders to a dedicated "modulestore" subdirectory;
which corresponds better to the initial intent of the fs_root setting.
- fix: frontend failure during login to the lms. See:
https://github.com/openedx/build-test-release-wg/issues/104
- feat: move all forum-related code to a dedicated plugin. Forum is an
optional feature, and as such it deserves its own plugin. Starting from
Maple, users will be able to install the forum from
https://github.com/overhangio/tutor-forum/
- migrate from DCS_* session cookie settings to SESSION_*. That's
because edx-platform no longer depends on django-cookies-samesite. Close
https://github.com/openedx/build-test-release-wg/issues/110
- get rid of tons of deprecation warnings in the lms/cms
- feat: make it possible to point to themed assets. Cherry-picking this
change makes it possible to point to themed assets with a theme-agnostic
url, notably from MFEs.
- Install all official plugins as part of the `tutor[full]` package.
- Don't print error messages about loading plugins during autocompletion.
- Prompt for image building when upgrading from one release to the next.
- Add `tutor local start --skip-build` option to skip building Docker images.
Close #450.
Close #545.
This introduces quite a few changes to make it easier to run Caddy as a load
balancer in Kubernetes:
- Make it possible to start/stop a selection of resources with ``tutor k8s
start/stop [names...]``.
- Make it easy to deploy an independent LoadBalancer by converting the caddy
service to a NodePort when ``ENABLE_WEB_PROXY=false``.
- Add a ``app.kubernetes.io/component: loadbalancer`` label to the LoadBalancer
service.
- Add ``app.kubernetes.io/name`` labels to all services.
- Preserve the LoadBalancer service in ``tutor k8s stop`` commands.
- Wait for the caddy deployment to be ready before running initialisation jobs.
Close #532.
Previously, configuration management was very confusing because we kept mixing
"base" and "defaults" configuration:
- It was difficult to make the difference between core settings that were
necessary (e.g: passwords) as opposed to others that could simply be
defaulted to.
- The order of settings in config.yml mattered: config entries that depended on
other needed to be defined later. As a consequence, Tutor was not compatible
with Python 3.5, where dict entries are not sorted.
Forum is an optional feature, and as such it deserves its own plugin. Starting
from Maple, users will be able to install the forum from
https://github.com/overhangio/tutor-forum/
Close #450.
With this change, containers are no longer run as "root" but as unprivileged
users. This is necessary in some environments, notably some Kubernetes
clusters.
To make this possible, we need to manually fix bind-mounted volumes in
docker-compose. This is pretty much equivalent to the behaviour in Kubernetes,
where permissions are fixed at runtime if the volume owner is incorrect. Thus,
we have a consistent behaviour between docker-compose and Kubernetes.
We achieve this by bind-mounting some repos inside "*-permissions" services.
These services run as root user on docker-compose and will fix the required
permissions, as per build/permissions/setowner.sh These services simply do not
run on Kubernetes, where we don't rely on bind-mounted volumes. There, we make
use of Kubernete's built-in volume ownership feature.
With this change, we get rid of the "openedx-dev" Docker image, in the sense
that it no longer has its own Dockerfile. Instead, the dev image is now simply
a different target in the multi-layer openedx Docker image. This makes it much
faster to build the openedx-dev image.
Because we declare the APP_USER_ID in the dev/docker-compose.yml file, we need
to pass the user ID from the host there. The only way to achieve that is with a
tutor config variable. The downside of this approach is that the
dev/docker-compose.yml file is no longer portable from one machine to the next.
We consider that this is not such a big issue, as it affects the development
environment only.
We take this opportunity to replace the base image of the "forum" image. There
is now no need to re-install ruby inside the image. The total image size is
only decreased by 10%, but re-building the image is faster.
In order to run the smtp service as non-root, we switch from namshi/smtp to
devture/exim-relay. This change should be backward-compatible.
Note that the nginx container remains privileged. We could switch to
nginxinc/nginx-unprivileged, but it's probably not worth the effort, as we are
considering to get rid of the nginx container altogether.
Close #323.
Previously, job declarations were always loaded from local/docker-compose.yml
and local/docker-compose.jobs.yml. This meant that it was not possible to
override job declarations in dev mode. It is now the case, with
dev/docker-compose.jobs.yml and dev/docker-compose.jobs.override.yml. Neither
of these files exist yet... But who knows? we might need this feature one day.
In any case the code is much cleaner now.
Before, custom `docker_compose_func` arguments had to be passed to job runners.
This was not very elegant. Also, it prevented us from loading custom job files
in development.
Here, we adopt a better object-oriented approach, where context classes are
ordered hierarchically.
This paves the way for loading `dev/docker-compose.jobs.yml` files in `tutor
dev init` commands -- which will be necessary to fix permissions in dev/local
mode.
Previously, we were building all images every time we ran a "local start"
command. This was causing unnecessary rebuild. Here, instead, we make use of
the `docker-compose up --build`. This means that only the required images will
be rebuilt.
Limits the memory chek to the 'local quickstart' command, makes error
handling more accurate and adds warning messages for some conditions.
Also adds a mention of this in troubleshooting.rst.
Adds a check in the 'local' command group that requires at least
4 GB of RAM to be allocated to Docker when running any of the
local subcommands on macOS. This addresses a common issue where
Docker's default setting (2 GB) causes startup to crash with
misleading error messages.
Here, we make it possible to automatically append a suffix to the version and app
name (in the sense of appdirs). This guarantees that a tutor edge project will
not accidentally override another community release.
In addition, we take the opportunity to document the tutor versioning format.
(I've been meaning to do that for a long time)
Previously, the list of domain names to which a theme was assigned had to be
specified manually. Now, the themes are automatically assigned to the LMS and
the CMS, both in development and production modes.
When upgrading mongodb, the mongodb container takes a little while to become
ready. Running the "exec" command thus triggers an error:
docker-compose -f /path/to/env/local/docker-compose.yml -f /path/to/env/local/docker-compose.prod.yml --project-name tutor_local exec mongodb mongo --eval db.adminCommand({ setF
eatureCompatibilityVersion: "4.0" })
MongoDB shell version v4.0.24
connecting to: mongodb://127.0.0.1:27017/?gssapiServiceName=mongodb
2021-06-14T10:53:21.510+0000 E QUERY [js] Error: couldn't connect to server 127.0.0.1:27017, connection attempt failed: SocketException: Error connecting to 127.0.0.1:27017 :: caused by :: Connection refused:
connect@src/mongo/shell/mongo.js:356:17
@(connect):2:6
exception: connect failed
Error: Command failed with status 1: docker-compose -f /path/to/env/local/docker-compose.yml -f /path/to/env/local/docker-compose.prod.yml --project-name tutor_local exec mongodb mongo --eval db.adminCommand({ setFeatureCompatibilityVersion: "4.0" })
We add a "sleep" statement to the upgrade process to ensure that the mongodb
container is available.
An issue with the latest release of docker-compose was reported here:
https://discuss.overhang.io/t/undefined-entrypoint-throws-error-in-docker-compose-2-0-0-beta-4/1716
The mysql-job definition had an empty entrypoint (`[]`). This was causing the following error:
the initiation of mysql fails with “services.mysql-job.entrypoint must be a string …
Error: Command failed with status 15”
I can't remember at all why we had to define an empty entrypoint. It probably
has to do with the fact that we could not run `sh -e -c "..."` commands in
mysql jobs. Similarly, the k8s job definition sets `command: []`. I tested both
local and k8s deployments without these definitions and they work just fine. So
I guess we can get rid of them.
When disable a plugin that set config entried, such as the minio plugin, tutor was logging the following:
Disabling plugin minio...
Removed config entry OPENEDX_AWS_ACCESS_KEY=openedx
Removed config entry OPENEDX_AWS_SECRET_ACCESS_KEY={{ MINIO_AWS_SECRET_ACCESS_KEY }}
Plugin disabled
The config values were not rendered during printing, which is a shame, because
the whole point of this log line is to warn users of passwords/secrets that are
being removed. Here, we make sure that the config values are properly rendered.
The new logs are now:
Disabling plugin minio...
Removing config entry OPENEDX_AWS_ACCESS_KEY=openedx
Removing config entry OPENEDX_AWS_SECRET_ACCESS_KEY=64vpCVLxhDxBuNjakSrX4CQg
Plugin disabled
In most cases, it makes very little sense to edit the namespace that an
application is running in. Quite often, users are granted access to just one
namespace and don't have the necessary rights to edit the namespace -- and for
good security reasons. In such cases, the k8s namespace object already exists
and there is no need for the user to edit or create it. Here, what we do is
that we create the namespace only if it does not exist. This should solve quite
a few permission issues, notably for Openshift users.
When running `tutor local quickstart -p` we were getting the following error:
Usage: custom [OPTIONS] ARGS...
Try 'custom --help' for help.
Error: Missing argument 'ARGS...'.
The docker-compose command sometimes accept a single command ("pull") with zero
argument.
See: https://discuss.overhang.io/t/local-quickstart-not-working-when-pullimages-enabled/1526
I stumbled upon a bug that should have been detected by the type
checking. Turns out, considering that config is of type Dict[str, Any]
means that we can use just any method on all config values -- which is
terrible. I discovered this after I set `config["PLUGINS"] = None`:
this triggered a crash when I enabled a plugin.
We resolve this by making the Config type more explicit. We also take
the opportunity to remove a few cast statements.