<rant>I attempted to actually run Tutor with Podman and I was sorely disappointed.
The only reliable source of docs that I found concerning the integration with
docker-compose is this blog post:
https://www.redhat.com/sysadmin/podman-docker-compose
There are no other official docs 😓
1. The instructions given in the blog post don't work out of the box. Launching
the podman service failed altogether on Ubuntu 20.04 and 20.10. It worked on
CentOS 8, but some parameters need to be changed, such as the docker socket path.
2. After I got the podman service working, I managed to get an Open edX
platform running with tutor, but with the root user. Then, containers
complained that they could not write data to the bind-mounted volumes. I
attempted to run as a non-root user, and discovered that the podman socket is
only readable by root. This should explain why all commands from that blog post
are prefixed by sudo.
Long story short, I was hoping to update the tutorial. Instead, I'm just moving
it for the sake of better organisation. For the life of me, I do not understand
why some people would want to run Podman instead of Docker. Bad documentation
is an immediate turn-off for me. From my perspective, podman is mostly an
overblown marketing stunt.</rant>
There is too much information in each of the local/k8s/dev docs pages. The
"guides" that are listed in each one of those pages are moved either to "common
tasks" or to a dedicated "tutorials" section. This paves the way for more
comprehensive tutorials, where we describe how to run the latest master
branches of Open edX.
I am well aware that, as they stand, the tutorials are of poor quality and
should be rewritten. This is a task for another day/commit. For now, we only
move the contents to a separate part of the docs.
Also, we should add a "reference" section to the docs, where we add the result
of `tutor <subcommand> --help`.
Previously, the list of domain names to which a theme was assigned had to be
specified manually. Now, the themes are automatically assigned to the LMS and
the CMS, both in development and production modes.
It should be unnecessary to build a custom openedx-dev Docker image. All tests
can run from within the dev Docker image, with a couple of additional environment
variables.
The package maintainer of the "tutor" package was kind enough to
transfer ownership of the project to us. This is great, because we no
longer have to use the "openedx" suffix, which is trademarked.
For the time being, we keep maintaining the "tutor-openedx" package
which has a 1-to-1 dependency on the "tutor" package. In the future, we
expect that we will no longer push upgrades to tutor-openedx.
Here we add to the docs a few shameless plugs about Cairn -- because
it's really awesome!
We also add a few improvements to the wording, here and there.
We remove security patches and custom fixes which are now part of koa.3.
We take the opportunity to make it possible to build the openedx Docker image
without relying on a corresponding openedx-i18n repo tag: often, we want to
test whether the image simply builds successfully, and we don't need up-to-date
translations. For those cases, it's now possible to pass the `-a
OPENEDX_I18N_VERSION=oldertag` build argument.
First, allow using custom Django settings on a development
environment (as documented but not implemented), setting it to the
correct value of `tutor.development`. Prior to this, `tutor dev
runserver lms` would default to `tutor.production` when on a custom edX
branch.
Second, fix the documentation so the correct environment variable is
described, at the same time removing an option that doesn't seem to work.
See discussion: https://discuss.overhang.io/t/koa-dev-lms-doesnt-find-static-content/1250
We manage to get unit tests to run in a dedicated openedx-test container. Only
35 tests are failing (out of 17k). I suspect these tests are also failing in
the devstack.
This introduces a new dev/local command:
    tutor dev bindmount CONTAINER PATH
And a new volume syntax:
    tutor dev run --volume=PATH CONTAINER
This syntax automatically bind-mounts folders from the tutorroot/volumes
directory, which is pretty nifty.
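A minimal sketch of how such a `--volume=PATH` shorthand could be expanded into a regular HOST:CONTAINER bind-mount argument. The layout used here (one folder per path basename under tutorroot/volumes) and the function names are assumptions made for illustration, not necessarily what Tutor does:

```python
import posixpath


def bindmount_host_path(root, container_path):
    """Return the host folder assumed to mirror `container_path`.

    Assumption: the folder is named after the last path component and
    lives under <root>/volumes.
    """
    name = posixpath.basename(container_path.rstrip("/"))
    return posixpath.join(root, "volumes", name)


def expand_volume(root, container_path):
    """Expand --volume=PATH into a docker-compose style HOST:CONTAINER value."""
    return "{}:{}".format(bindmount_host_path(root, container_path), container_path)


# Hypothetical tutor root and container path, for illustration only
print(expand_volume("/tutorroot", "/openedx/venv"))
```

With this layout, the same PATH can be passed to both the bindmount command and the volume option without spelling out the host folder.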
- 💥[Improvement] Upgrade Open edX to Koa
  - 💥 Setting changes:
    - The ``ACTIVATE_HTTPS`` setting was renamed to ``ENABLE_HTTPS``.
    - Other ``ACTIVATE_*`` variables were all renamed to ``RUN_*``.
    - The ``WEB_PROXY`` setting was removed and ``RUN_CADDY`` was added.
    - The ``NGINX_HTTPS_PORT`` setting is deprecated.
  - Architectural changes:
    - Use Caddy as a web proxy for automated SSL/TLS certificate generation:
      - Nginx no longer listens to port 443 for https traffic.
      - The Caddy configuration file comes with a new ``caddyfile`` patch for much simpler SSL/TLS management.
      - Configuration files for web proxies are no longer provided.
      - Kubernetes deployment no longer requires setting up a custom Ingress resource or custom manager.
    - Gunicorn and Whitenoise are replaced by uwsgi: this increases bootstrap performance and makes it no longer necessary to mount media folders in the Nginx container.
    - Replace memcached and rabbitmq by redis.
  - Additional features:
    - Make it possible to disable all plugins at once with ``plugins disable all``.
    - Add ``tutor k8s wait`` command to wait for a pod to become ready.
    - Faster, more reliable static assets with local memory caching.
  - Deprecation: proxy files for Apache and Nginx are no longer provided out of the box.
  - Removed plugin `{{ patch (...) }}` statements:
    - "https-create", "k8s-ingress-rules", "k8s-ingress-tls-hosts": these are no longer necessary. Instead, declare your app in the "caddyfile" patch.
    - "local-docker-compose-nginx-volumes": this patch was primarily used to serve media assets. The recommended solution is now to serve assets with uwsgi.
When I tried running `openedx-assets build` on my `tutor dev lms` machine, I got an error:
```
openedx@1dfe0ece7805:~/edx-platform$ openedx-assets build --env=dev
mkdir_p path('common/static/common/js/vendor')
mkdir_p path('common/static/common/css')
mkdir_p path('common/static/common/css/vendor')
Copying vendor files into static directory
Traceback (most recent call last):
File "/openedx/bin/openedx-assets", line 218, in <module>
main()
File "/openedx/bin/openedx-assets", line 89, in main
args.func(args)
File "/openedx/bin/openedx-assets", line 94, in run_build
run_npm(args)
File "/openedx/bin/openedx-assets", line 117, in run_npm
assets.process_npm_assets()
File "/openedx/edx-platform/pavelib/assets.py", line 643, in process_npm_assets
copy_vendor_library(library)
File "/openedx/edx-platform/pavelib/assets.py", line 614, in copy_vendor_library
raise Exception(u'Missing vendor file {library_path}'.format(library_path=library_path))
Exception: Missing vendor file node_modules/backbone.paginator/lib/backbone.paginator.js
```
As suggested in [this topic](https://discuss.overhang.io/t/issue-with-paver-update-assets/641) I had to run `npm install` to get the packages it tries to copy from. That makes sense, so I think it should be part of the instructions here.
Users appear to run into compatibility issues where podman-compose
does not behave exactly as docker-compose. Add a warning that those
issues exist, and that reports about them are welcome on the
Tutor Discourse.
Reference:
https://discuss.overhang.io/t/tutor-with-podman-not-working/905/2
Previously, it was not possible to override the docker registry for just
one or a few services. Setting the DOCKER_REGISTRY configuration
parameter would apply to all images. This was inconvenient. To resolve
this, we include the docker registry value in the DOCKER_IMAGE_*
configuration parameters. This allows users to override the docker
registry individually by defining the DOCKER_IMAGE_SERVICENAME
configuration parameter.
See https://discuss.overhang.io/t/kubernetes-ci-cd-pipeline/765/3
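A minimal sketch of the idea, assuming that each default DOCKER_IMAGE_* value embeds the registry and that a user-provided value replaces the whole image name. The default image names below are placeholders, and `str.format` merely stands in for whatever templating the configuration actually uses:

```python
# Hypothetical defaults: each image value embeds the registry placeholder.
DEFAULTS = {
    "DOCKER_REGISTRY": "docker.io/",
    "DOCKER_IMAGE_OPENEDX": "{DOCKER_REGISTRY}overhangio/openedx:latest",
    "DOCKER_IMAGE_NGINX": "{DOCKER_REGISTRY}nginx:stable",
}


def resolve_images(user_config):
    """Return the final image name for every DOCKER_IMAGE_* parameter.

    User values take precedence over defaults. A user value that contains
    no placeholder is kept verbatim, which is what allows overriding the
    registry for a single service.
    """
    config = dict(DEFAULTS)
    config.update(user_config)
    registry = config["DOCKER_REGISTRY"]
    return {
        key: value.format(DOCKER_REGISTRY=registry)
        for key, value in config.items()
        if key.startswith("DOCKER_IMAGE_")
    }


# Override the registry for one service only (registry.example.com is made up)
images = resolve_images({"DOCKER_IMAGE_NGINX": "registry.example.com/nginx:stable"})
for name in sorted(images):
    print(name, "=", images[name])
```

Changing DOCKER_REGISTRY alone still affects every image that was not individually overridden.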
Here, we upgrade the Open edX platform from Ironwood to Juniper. This
upgrade does not come with many feature changes, but there are many
technical improvements under the hood:
- Upgrade from Python 2.7 to 3.5
- Upgrade from Mongodb v3.2 to v3.6
- Upgrade Ruby to 2.5.7
We took the opportunity to completely rethink the way locally running
platforms should be accessed for testing purposes. It is no longer
possible to access a running platform from http://localhost and
http://studio.localhost. Instead, users should access
http://local.overhang.io and http://studio.local.overhang.io. This
drastically simplifies internal communication between Docker containers.
To upgrade, users should simply run:
    tutor local quickstart
For Kubernetes platforms, the upgrade process is outlined when running:
    tutor k8s upgrade --from=ironwood
There are too many different ways to deploy an Ingress resource and to
generate SSL/TLS certificates: it's too much responsibility to make that
decision for the end user.
Running jobs was previously done with "exec". This was because it
allowed us to avoid copying too much container specification information
from the docker-compose/deployments files to the jobs files. However,
this was limiting:
- In order to run a job, the corresponding container had to be running.
This was particularly painful in Kubernetes, where containers are
crashing as long as migrations are not correctly run.
- Containers in which we need to run jobs needed to be present in the
docker-compose/deployments files. This is unnecessary, for example when
mysql is disabled, or in the case of the certbot container.
Now, we create dedicated jobs files, both for local and k8s deployment.
This introduces a little redundancy, but not too much. Note that
dependent containers are not listed in the docker-compose.jobs.yml file,
so an actual platform is still supposed to be running when we launch the
jobs.
This also introduces a subtle change: now, jobs go through the container
entrypoint prior to running. This is probably a good thing, as it will
avoid forgetting about incorrect environment variables.
In k8s, we find ourselves interacting way too much with the kubectl
utility. Parsing output from the CLI is a pain. So we need to switch to
the native kubernetes client library.
The "Certificate" objects are no longer required. As a consequence, the
"k8s-ingress-certificates" patch has become useless and should be removed
from plugins.
Users can now add custom translation strings to a locale folder at build
time, very much in the same way as custom themes or requirements. This
is quite convenient, although it does require quite a bit of time to
rebuild the docker images.
During an incident at npmjs.org it was extremely difficult to pull
nodejs packages -- so we made it possible to pull from a custom
registry, deployed for instance with Verdaccio (https://verdaccio.org/).
When we were changing unit titles in the CMS, the changes were taking a
long time to be reflected in the LMS. That's because the cache key that
corresponds to the course structure was not being updated. It was the
responsibility of an asynchronous LMS celery worker to update this cache
entry. However, this was impossible in most cases because tasks
triggered in the CMS were only processed by CMS workers. That is, unless
we are using a custom celery router:
https://celery.readthedocs.io/en/latest/userguide/routing.html#routers
This is what edx-platform does in the devstack: certain CMS tasks are
forwarded both to CMS and to LMS workers. This is achieved by defining
the ALTERNATE_WORKER_QUEUES="lms" django setting in the CMS.
Adding this setting to Tutor solves the problem in production. However,
in development mode Open edX runs without workers
(`CELERY_ALWAYS_EAGER=True`). This means that the course structure will
not be automatically updated when running `tutor dev` commands, which is
a shame. The alternative is to define the
"block_structure.invalidate_cache_on_publish" waffle switch. This can be
done from the UI (in /admin/waffle/switch/add/) or by running:
    tutor dev run lms ./manage.py lms waffle_switch block_structure.invalidate_cache_on_publish on --create
However, this flag seems to slow down access to the LMS for the first
user who tries to access the course after it has been updated.
Close #302
There are too many patches on top of ironwood.2, and it's not practical
to pull them all one by one. We still want to build on top of a specific
version, and not a branch, so we use a dirty hack to guarantee that the
docker image is properly rebuilt by CI when we change it.
Because we are running a version of elasticsearch older than Methuselah,
the docker environment variables were not properly taken into account.
For instance, the cluster name and "mlockall" settings were incorrect,
as we could see by running:
    $ tutor local run lms curl elasticsearch:9200 | grep cluster_name
    ...
    "cluster_name" : "elasticsearch",
    $ tutor local run lms curl elasticsearch:9200/_nodes/process?pretty | grep mlock
    ...
    "mlockall" : false
See
https://discuss.overhang.io/t/elastic-container-is-not-being-removed/312/3
for discussion.
This fix also introduces a new tutor configuration setting to adjust the
elasticsearch heap size.
A prior change used the ironwood.1 tag to build the Android app in an
attempt to solve #289. Turns out that this change was unnecessary. So
here we revert to a more recent release of the Android app. Instead of
building from the master branch (which might create surprises) we build
from a fixed release tag.
The source repo and version are customisable via build arguments.
https://podman.io/ is meant to be a drop-in replacement for Docker.
Thus, with some tweaking to the installation environment, it appears
to be perfectly feasible to run Tutor in a Docker-less environment
that only has Podman and podman-compose installed.
Add installation instructions for doing just that.
By de-duplicating the code between dev.py and local.py, we are able to
support more docker-compose run/up/stop options passed from tutor. To do
so, we had to disable some features, such as automatically mounting the
edx-platform repo when the TUTOR_EDX_PLATFORM_PATH environment variable
was defined.
It makes more sense to document this command instead of adding it to the
`local` commands. If need be, in the future we should be able to re-add
it as a plugin.
This command adds a burden on the `local` and `k8s` commands. It does not
make sense to provide this command out of the box, and not other
administration commands. Instead, we should better document how to run
regular `manage.py` commands from tutor.
Close #269.
The `dev` commands now rely on a different openedx-dev docker image.
This gives us multiple improvements:
- no more chown in base image
- faster chown in development
- mounted requirements volume in development
- fix static assets issues
- bundled ipdb/vim/... packages, which are convenient for development
Close #235
All existing plugins are added to the binary bundle, in their latest
version, so that users don't need to pip install tutor.
Also, the tutor MANIFEST.in file was removed to simplify the management
of package data.
Close #242.
The 0003 migration from the certificates app of the LMS requires that
the S3-like platform is correctly setup during initialisation. To solve
this issue, we introduce a pre-init hook that is run prior to the LMS
migrations.
Having an identical "ironwood" tag for all releases is not practical, in
particular for breaking changes. Thus, docker images are now pinned to
the tutor version that they were built with.
Thus, we remove the -y/--yes options, which were kind of unintuitive,
and we add instead `-i/--interactive`. The quickstart commands remain
interactive by default, but can be silenced with `-I/--non-interactive`.
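A minimal sketch of such a pair of flags, written with `argparse` from the standard library purely for illustration (it is not the actual command definition). Both flags write to the same destination, and the default depends on the command:

```python
import argparse


def build_parser(default_interactive):
    """Build a parser where -i and -I toggle a single `interactive` value.

    `default_interactive` is True for commands that ask questions by
    default (such as quickstart) and False for the others.
    """
    parser = argparse.ArgumentParser(prog="sketch")
    parser.add_argument(
        "-i", "--interactive", dest="interactive", action="store_true",
        help="ask questions before running",
    )
    parser.add_argument(
        "-I", "--non-interactive", dest="interactive", action="store_false",
        help="run without asking questions",
    )
    # Parser-level default overrides the per-action defaults set above
    parser.set_defaults(interactive=default_interactive)
    return parser


quickstart = build_parser(default_interactive=True)
print(quickstart.parse_args([]).interactive)
print(quickstart.parse_args(["-I"]).interactive)
```

Sharing one destination keeps the calling code simple: it only ever checks a single boolean.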
Missing features:
- https certificates
- xqueue
- lms/cms workers
Moreover, we have scalability issues due to the uploaded file storage in the
lms/cms. To address this issue we need to develop the MinIO plugin so
that it becomes compatible with Open edX.
Close #126, #179, #187
- More concise table of contents
- New intro
- Simpler make commands
- Fix a couple typos here and there
- Get rid of the default github issue template, and start using the
template created online.
The "latest" tag is a pain to maintain: it's a tag that we delete and
re-create at every release. Whenever we delete it, the binaries become
unavailable on Github until they are re-generated. Thus, from now on, we
conform to good practices (as exemplified by the
github.com/docker/compose project) and distribute only pinned releases.
The "nightly" tag remains, for now, as it allows us to distribute beta
features. It may disappear in the future.
Now that the correct webpack settings are loaded by the `update_assets`
command in Ironwood, we can stop relying on the `openedx-assets` script.
Actually, we could probably remove it.
Configuration values can be loaded from the system environment by adding
a "TUTOR_" prefix.
Environment values supersede values from the user configuration file, so
that we can set values from the command line with "KEY=VAL tutor config
save --silent" even when KEY is already present in the user
configuration file.
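A minimal sketch of this precedence rule, assuming a flat key/value configuration. The function names are hypothetical, and values are kept as raw strings because the parsing step is not described here:

```python
import os

PREFIX = "TUTOR_"


def load_env_overrides(environ):
    """Collect PREFIX-ed variables, stripping the prefix from their names."""
    return {
        key[len(PREFIX):]: value
        for key, value in environ.items()
        if key.startswith(PREFIX)
    }


def merge_config(user_config, environ=None):
    """Return the user configuration with environment values layered on top."""
    environ = os.environ if environ is None else environ
    merged = dict(user_config)
    # Environment values supersede values from the user configuration file
    merged.update(load_env_overrides(environ))
    return merged


# Hypothetical setting name and values, for illustration only
user_config = {"LMS_HOST": "www.myopenedx.com"}
fake_environ = {"TUTOR_LMS_HOST": "lms.example.com", "PATH": "/usr/bin"}
print(merge_config(user_config, fake_environ))
```

Variables without the prefix, such as PATH above, are ignored.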
Environment is no longer generated separately for each target, but only
once the configuration is saved.
Note that the environment is automatically updated during
re-configuration, based on a "version" file stored in the environment.
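A minimal sketch of such a version check, assuming the environment folder contains a plain-text file named "version" that holds the version it was generated with. The helper names and the version string are hypothetical:

```python
import os
import tempfile

VERSION_FILENAME = "version"  # assumed file name, based on the description above


def needs_update(env_root, current_version):
    """Return True when the stored version is missing or differs."""
    path = os.path.join(env_root, VERSION_FILENAME)
    if not os.path.exists(path):
        return True
    with open(path) as f:
        return f.read().strip() != current_version


def mark_updated(env_root, current_version):
    """Record the version the environment was last generated with."""
    os.makedirs(env_root, exist_ok=True)
    with open(os.path.join(env_root, VERSION_FILENAME), "w") as f:
        f.write(current_version)


with tempfile.TemporaryDirectory() as env_root:
    print(needs_update(env_root, "1.2.3"))  # no version file yet
    mark_updated(env_root, "1.2.3")
    print(needs_update(env_root, "1.2.3"))  # file matches the current version
```

The environment would be regenerated whenever `needs_update` returns True, then stamped with `mark_updated`.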
The USERID environment variable was no longer passed to the docker image
in development mode.
We take the opportunity to improve the documentation regarding the dev
environment.
Close #177.
We had to backtrack from the latest release of the
android app, which is not compatible with the mobile api v0.5 available
in Hawthorn. This should change in ironwood.
Also, we now include the correct oauth client ID in the app: the
incorrect ID prevented communication with the LMS.
The android app is now out of beta \o/
Close #89
Replace all make commands by a single "tutor" binary. Environment and
data are all moved to ~/.tutor/local/share/tutor. We take the
opportunity to add a web UI and revamp the documentation.
This is a complete rewrite.
Close #121.
Close #147.