Users appear to run into compatibility issues where podman-compose
does not behave exactly as docker-compose. Add a warning that those
issues exist, and that reports about them are welcome on the
Tutor Discourse.
Reference:
https://discuss.overhang.io/t/tutor-with-podman-not-working/905/2
Previously, it was not possible to override the docker registry for just
one or a few services. Setting the DOCKER_REGISTRY configuration
parameter would apply to all images. This was inconvenient. To resolve
this, we include the docker registry value in the DOCKER_IMAGE_*
configuration parameters. This allows users to override the docker
registry individually by defining the DOCKER_IMAGE_SERVICENAME
configuration parameter.
See https://discuss.overhang.io/t/kubernetes-ci-cd-pipeline/765/3
Here, we upgrade the Open edX platform from Ironwood to Juniper. This
upgrade does not come with many feature changes, but there are many
technical improvements under the hood:
- Upgrade from Python 2.7 to 3.5
- Upgrade from Mongodb v3.2 to v3.6
- Upgrade Ruby to 2.5.7
We took the opportunity to completely rething the way locally running
platforms should be accessed for testing purposes. It is no longer
possible to access a running platform from http://localhost and
http://studio.localhost. Instead, users should access
http://local.overhang.io and https://studio.local.overhang.io. This
drastically simplifies internal communication between Docker containers.
To upgrade, users should simply run:
tutor local quickstart
For Kubernetes platform, the upgrade process is outlined when running:
tutor k8s upgrade --from=ironwood
There are too many different ways to deploy an Ingress resource and to
generate SSL/TLS certificates: it's too much responsibility to make that
decision for the end user.
Running jobs was previously done with "exec". This was because it
allowed us to avoid copying too much container specification information
from the docker-compose/deployments files to the jobs files. However,
this was limiting:
- In order to run a job, the corresponding container had to be running.
This was particularly painful in Kubernetes, where containers are
crashing as long as migrations are not correctly run.
- Containers in which we need to run jobs needed to be present in the
docker-compose/deployments files. This is unnecessary, for example when
mysql is disabled, or in the case of the certbot container.
Now, we create dedicated jobs files, both for local and k8s deployment.
This introduces a little redundancy, but not too much. Note that
dependent containers are not listed in the docker-compose.jobs.yml file,
so an actual platform is still supposed to be running when we launch the
jobs.
This also introduces a subtle change: now, jobs go through the container
entrypoint prior to running. This is probably a good thing, as it will
avoid forgetting about incorrect environment variables.
In k8s, we find ourselves interacting way too much with the kubectl
utility. Parsing output from the CLI is a pain. So we need to switch to
the native kubernetes client library.
The "Certificate" objects are no longer required. As a consequence, the
"k8s-ingress-certificates" has become useless and should be removed from
plugins.
Users can now add custom translation strings to a locale folder at build
time, very much in the same way as custom themes or requirements. This
is quite convenient, although is does require quite a bit of time to
rebuild the docker images.
During an incident at npmjs.org it was extremely difficult to pull
nodejs packages -- so we made it possible to pull from a custom
registry, deployed for instance with Verdaccio (https://verdaccio.org/).
When we were changing unit titles in the CMS, the changes were taking a
long time to be reflected in the LMS. That's because the cache key that
corresponds to the course structure was not being updated. It was the
responsibility of an asynchronous LMS celery worker to update this cache
entry. However, this was impossible in most cases because tasks
triggered in the CMS were only processed by CMS workers. That is, unless
we are using a custom celery router:
https://celery.readthedocs.io/en/latest/userguide/routing.html#routers
This is what edx-platform does in the devstack: certain CMS tasks are
forwarded both to CMS and to LMS workers. This is achieved by defining
the ALTERNATE_WORKER_QUEUES="lms" django setting in the CMS.
Adding this setting to Tutor solves the problem in production. However,
in development mode Open edX runs without workers
(`CELERY_ALWAYS_EAGER=True`). This means that the course structure will
not be automatically updated when running `tutor dev` commands, which is
a shame. The alternative is to define the
"block_structure.invalidate_cache_on_publish" waffle switch. This can be
done from the UI (in /admin/waffle/switch/add/) or by running:
tutor dev run lms ./manage.py lms waffle_switch block_structure.invalidate_cache_on_publish on --create
However, this flag seems to slow down access to the LMS for the first
user who tries to access the course after it has been updated.
Close #302
There are too many patches on top of ironwood.2, and it's not practical
to pull them all one by one. We still want to build on top of a specific
version, and not a branch, so we use a dirty hack to guarantee that the
docker image is properly rebuilt by CI when we change it.
Because we are running a version of elasticsearch older than Methusalem,
the docker environment variables were not properly taken into account.
For instance, the cluster name and "mlockall" settings were incorrect,
as we could see by running:
$ tutor local run lms curl elasticsearch:9200 | grep cluster_name
...
"cluster_name" : "elasticsearch",
$ tutor local run lms curl elasticsearch:9200/_nodes/process?pretty | grep mlock
...
"mlockall" : false
See
https://discuss.overhang.io/t/elastic-container-is-not-being-removed/312/3
for discussion.
This fix also introduces a new tutor configuration setting to adjust the
elasticsearch heap size.
A prior change used the ironwood.1 tag to build the Android app in an
attempt to solve #289. Turns out that this change was unnecessary. So
here we revert to a more recent release of the Android app. Instead of
building from the master branch (which might create suprises) we build
from a fixed release tag.
The source repo and version are customisable via build arguments.
https://podman.io/ is meant to be a drop-in replacement for Docker.
Thus, with some tweaking to the installation environment, it appears
to be perfectly feasible to run Tutor in a Docker-less environment
that only has Podman and podman-compose installed.
Add installation instructions for doing just that.
By de-duplicating the code between dev.py and local.py, we are able to
support more docker-compose run/up/stop options passed from tutor. To do
so, we had to disable some features, such as automatically mounting the
edx-platform repo when the TUTOR_EDX_PLATFORM_PATH environment variable
was defined.
It makes more sense to document this command instead of adding it to the
`local` commands. If need be, in the future we should be able to re-add
it as a plugin.
This command adds a burden on the `local` and `k8s` command. It does not
make sense to provide this command out of the box, and not other
administration commands. Instead, we should better document how to run
regular `manage.py` commands from tutor.
Close #269.
The `dev` commands now rely on a different openedx-dev docker image.
This gives us multiple improvements:
- no more chown in base image
- faster chown in development
- mounted requirements volume in development
- fix static assets issues
- bundled ipdb/vim/... packages, which are convenient for development
Close #235