In config.yml the new value FORUM_MONGO_DB_DATABASE was added with `cs_comments_service` as default value.
In docker-entrypoint.sh of forum I changed the hardcoded `cs_commecnts_service` with the new config value.
Multiple .yml files changed to handle the new config value.
lms-worker was configured to run CMS tasks instead of LMS tasks. I'm not
sure what tasks were being dismissed, and what is the actual production
impact.
Redis data was not actually persisted, because the redis configuration file was
not mounted from the right location. In order to mount redis data in a
host-mounted directory, the working directory has to be properly set.
The problem was occurring both with docker-compose and Kubernetes.
Close #404.
First, allow using custom Django settings on a development
environment (as documented but not implemented), setting it to the
correct value of `tutor.development`. Prior to this, `tutor dev
runserver lms` would default to `tutor.production` when on a custom edX
branch.
Second, fix the documentation so the correct environment variable is
described, at the same time removing an option that doesn't seem to work.
See discussion: https://discuss.overhang.io/t/koa-dev-lms-doesnt-find-static-content/1250
- 💥[Improvement] Upgrade Open edX to Koa
- 💥 Setting changes:
- The ``ACTIVATE_HTTPS`` setting was renamed to ``ENABLE_HTTPS``.
- Other ``ACTIVATE_*`` variables were all renamed to ``RUN_*``.
- The ``WEB_PROXY`` setting was removed and ``RUN_CADDY`` was added.
- The ``NGINX_HTTPS_PORT`` setting is deprecated.
- Architectural changes:
- Use Caddy as a web proxy for automated SSL/TLS certificate generation:
- Nginx no longer listens to port 443 for https traffic
- The Caddy configuration file comes with a new ``caddyfile`` patch for much simpler SSL/TLS management.
- Configuration files for web proxies are no longer provided.
- Kubernetes deployment no longer requires setting up a custom Ingress resource or custom manager.
- Gunicorn and Whitenoise are replaced by uwsgi: this increases boostrap performance and makes it no longer necessary to mount media folders in the Nginx container.
- Replace memcached and rabbitmq by redis.
- Additional features:
- Make it possible to disable all plugins at once with ``plugins disable all``.
- Add ``tutor k8s wait`` command to wait for a pod to become ready
- Faster, more reliable static assets with local memory caching
- Deprecation: proxy files for Apache and Nginx are no longer provided out of the box.
- Removed plugin `{{ patch (...) }}` statements:
- "https-create", "k8s-ingress-rules", "k8s-ingress-tls-hosts": these are no longer necessary. Instead, declare your app in the "caddyfile" patch.
- "local-docker-compose-nginx-volumes": this patch was primarily used to serve media assets. The recommended is now to serve assets with uwsgi.
Previously, it was not possible to override the docker registry for just
one or a few services. Setting the DOCKER_REGISTRY configuration
parameter would apply to all images. This was inconvenient. To resolve
this, we include the docker registry value in the DOCKER_IMAGE_*
configuration parameters. This allows users to override the docker
registry individually by defining the DOCKER_IMAGE_SERVICENAME
configuration parameter.
See https://discuss.overhang.io/t/kubernetes-ci-cd-pipeline/765/3
In production, the ALTERNATE_WORKER_QUEUES setting is overridden by ""
(empty string). This might prevent LMS from sending tasks to the CMS. We
have not seen this issue emerge yet, but better be safe than sorry.
We must be careful not to process the tasks from the CMS, just like for
the CMS worker which does not process the tasks from the LMS.
Half of the tasks from edx.lms.core.default celery queue were being
processed by the CMS worker. Unfortunately, this CMS worker crashes on
some of those tasks. For instance, activation emails complain of a
missing "django_markup" template tag library because "xss_utils" is not
part of the installed app in the CMS.
The problem is that we need this edx.lms.core.default queue to be part
of the CELERY_QUEUES in the cms in order to send tasks from the CMS to
the LMS. The trick to resolve this situation is to ask the CMS celery
worker to not process the tasks from this queue.
To debug this issue, run in the LMS:
from student.tasks import send_activation_email
send_activation_email("{}")
Then watch the logs of the lms and cms workers. If the CMS workers picks
up this task (50% of the time prior to this change) then we have an
issue.
See:
https://discuss.overhang.io/t/reset-password-email-sent-but-activation-email-dont/690
Here, we upgrade the Open edX platform from Ironwood to Juniper. This
upgrade does not come with many feature changes, but there are many
technical improvements under the hood:
- Upgrade from Python 2.7 to 3.5
- Upgrade from Mongodb v3.2 to v3.6
- Upgrade Ruby to 2.5.7
We took the opportunity to completely rething the way locally running
platforms should be accessed for testing purposes. It is no longer
possible to access a running platform from http://localhost and
http://studio.localhost. Instead, users should access
http://local.overhang.io and https://studio.local.overhang.io. This
drastically simplifies internal communication between Docker containers.
To upgrade, users should simply run:
tutor local quickstart
For Kubernetes platform, the upgrade process is outlined when running:
tutor k8s upgrade --from=ironwood
Running jobs was previously done with "exec". This was because it
allowed us to avoid copying too much container specification information
from the docker-compose/deployments files to the jobs files. However,
this was limiting:
- In order to run a job, the corresponding container had to be running.
This was particularly painful in Kubernetes, where containers are
crashing as long as migrations are not correctly run.
- Containers in which we need to run jobs needed to be present in the
docker-compose/deployments files. This is unnecessary, for example when
mysql is disabled, or in the case of the certbot container.
Now, we create dedicated jobs files, both for local and k8s deployment.
This introduces a little redundancy, but not too much. Note that
dependent containers are not listed in the docker-compose.jobs.yml file,
so an actual platform is still supposed to be running when we launch the
jobs.
This also introduces a subtle change: now, jobs go through the container
entrypoint prior to running. This is probably a good thing, as it will
avoid forgetting about incorrect environment variables.
In k8s, we find ourselves interacting way too much with the kubectl
utility. Parsing output from the CLI is a pain. So we need to switch to
the native kubernetes client library.
Because we are running a version of elasticsearch older than Methusalem,
the docker environment variables were not properly taken into account.
For instance, the cluster name and "mlockall" settings were incorrect,
as we could see by running:
$ tutor local run lms curl elasticsearch:9200 | grep cluster_name
...
"cluster_name" : "elasticsearch",
$ tutor local run lms curl elasticsearch:9200/_nodes/process?pretty | grep mlock
...
"mlockall" : false
See
https://discuss.overhang.io/t/elastic-container-is-not-being-removed/312/3
for discussion.
This fix also introduces a new tutor configuration setting to adjust the
elasticsearch heap size.
This has an impact on plugin hooks. Plugin hooks that needed to run
inside mysql-client now need to run inside mysql container. This
simplifies the deployment, as we no longer have an empty mysql-client
container sitting around.
When mysql is not enabled (ACTIVATE_MYSQL=False) the mysql container is
simply a mysql client.
Video transcripts uploaded in the CMS were not visible in the LMS. This
was a symptom caused by the fact that the LMS and the CMS do not share
the same MEDIA_ROOT. We initially thought that data uploaded in the CMS
(such as transcripts) was stored in a shared data service, such as
mongodb. It is, in fact, not. This makes it even more important to run
an object storage service like minio for distributed services.
Close #229
This commit introduces many changes:
- a fully functional minio plugin for local installation
- an almost-functional native k8s deployment
- a new way to process configuration, better suited to plugins
There are still many things to do:
- get rid of all the TODOs
- get a fully functional minio plugin for k8s
- add documentation for pluginso
- ...
This is especially useful on Kubernetes. With this change, the forum
container no longer crashes whenever mongodb or elasticsearch are
unavailable. Instead, it just waits for thoses services to be up.
Previously, we could not run forum migrations in Kubernetes because they
relied on exec-ing the migration command in the running container -- and
there was no such container, because the services where not already up.
For initializing the mysql database, we now run the mysql client from
the mysql container. That means that the mysql client is no longer
required in the openedx container, as previously suggested in PR #209.
Tutor is not using any bash-specific feature.
This commit changes (almost) all invocations to /bin/bash replacing them with
/bin/sh. It allows building a docker image that does not bundle bash in it.
The file k8s.py still has a bash invocation, to make sure this change does not
impact functionality for any user.
In the future, we want to allow users to rely on third-party services
for data storage, such as hosted MySQL and such. To do so, we need to be
able to configure the host/port of these services, which we do here.
This is to address part of #114.
Replace all make commands by a single "tutor" binary. Environment and
data are all moved to ~/.tutor/local/share/tutor. We take the
opportunity to add a web UI and revamp the documentation.
This is a complete rewrite.
Close #121.
Close #147.