News/2022 Toolforge Stretch deprecation - Wikitech
This page details information about deprecating and removing hosts running Debian Stretch (9.x) as an operating system from the Toolforge infrastructure. The login bastions and Grid execution hosts are still running Stretch and must be replaced with new instances.
What is changing?
New bastion hosts running Debian Buster with connectivity to the new job grid
New versions of PHP, Python3, and other language runtimes
New versions of various support libraries
We are introducing a configuration option to let you select which operating system (Debian Stretch or Debian Buster) you want for your grid-based tool. That way, you can try out the migration from Stretch to Buster at your convenience.
Timeline
2022-02-15: Availability of Debian Buster grid announced to community. Done (email)
2022-03-21: Reminders via email to tool maintainers for tools still running on Stretch. Done
Week of 2022-04-21:
Daily reminders via email to tool maintainers for tools still running on Stretch. Done
Switch login.toolforge.org to point to Buster bastion. Done
Week of 2022-05-02: Evaluate migration status and formulate plan for final shutdown of Stretch grid
2022-06-02: Make Buster the default if no release was specified. Done (email)
2022-06-??: Shutdown Stretch grid
What should I do?
You should migrate your Toolforge tool to a newer environment.
You have two options:
migrate from Toolforge Stretch Grid Engine to Toolforge Kubernetes.
migrate from Toolforge Stretch Grid Engine to Toolforge Buster Grid Engine.
SSH to the bastions
During the compatibility period, there are 2 sets of bastions available:
login.toolforge.org: points to the old Debian Stretch bastion
dev.toolforge.org: points to the old Debian Stretch development bastion
login-buster.toolforge.org: points to the new Debian Buster bastion
dev-buster.toolforge.org: points to the new Debian Buster development bastion
When the time arrives, the old Stretch bastions will stop working, and both login.toolforge.org and dev.toolforge.org will point to Buster bastions.
Move from Grid Engine to Kubernetes
Our recommendation is that you move all your tools from the Grid Engine backend into Kubernetes.
If your tool consists of one or more jobs, there may be direct equivalences between the commands. Check the dedicated documents: Help:Toolforge/Jobs_framework#Grid_Engine_migration
If your tool consists of a webservice, check here instead.
Move a grid engine webservice
We strongly encourage you to migrate web services to Kubernetes instead of using the grid.
If you have strong reasons to keep using the grid for webservices, then try the --release {buster|stretch} parameter:
# Connect to the Buster bastion
ssh
# Become your tool account
become YOUR_TOOL
# Stop the webservice
webservice stop
# Start the webservice as a Kubernetes container rather than a grid job
webservice --backend kubernetes start
# -- OR --
# Start the webservice as a Buster grid job
webservice --backend gridengine --release buster start
See Help:Toolforge/Web#Choosing_a_backend for more information on migrating from grid engine to Kubernetes.
Python2 and Python3 webservices will need to rebuild their virtualenv environments on the new target runtime (Buster grid or Kubernetes).
NodeJS webservices will need to rebuild their $HOME/www/js/node_modules on the new target runtime (Buster grid or Kubernetes).
Move a continuous job
First, try migrating from Grid Engine to the new Toolforge jobs framework on Kubernetes.
If you cannot move today, we would be interested to learn what prevents you from doing so.
If you decide to keep using the grid, simply run:
# Connect to the Stretch bastion
ssh
# Become your tool account
become YOUR_TOOL
# Start your job on the Buster job grid (note: this is a specific example for a PHP job that checks for a quota)
jstart -release buster -mem 350m php check_my_quota.php
The exact commands needed to start each continuous job vary greatly from tool to tool. This would be a great time to make a page of reference material for yourself and other maintainers here on Wikitech in the Tool namespace and using the Tool template if you haven't already.
Move a cron job
First, try migrating from Grid Engine to the new Toolforge jobs framework on Kubernetes.
If you cannot move today, we would be interested to learn what prevents you from doing so.
If you decide to keep using the grid, migrate your cron jobs by adding the -release buster argument to the jsub commands in the entries you create with the crontab command.
The grid cron job server is Debian Stretch and will remain that way until the end of the migration, meaning that if you don't specify a -release buster option, the default (stretch) will be used.
See Executable paths with jsub for other potential problems with cron jobs.
If your workload permits, please avoid scheduling cronjobs from midnight to 3am so you're not competing with other cronjobs for system resources. That time window is currently very crowded.
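As a rough sketch of the -release buster change described above, the following shell function rewrites crontab lines read from standard input so that each jsub call without a -release argument gets -release buster. The function name add_buster_release is hypothetical, and the sample lines reuse the crontab entries shown elsewhere on this page. It only prints the result; review the output before installing it with the crontab command.

```shell
# Hypothetical helper: add "-release buster" to jsub calls that do not
# already name a release. Reads crontab lines on stdin, prints to stdout.
add_buster_release() {
    # Skip lines that already contain "-release "; on all other lines,
    # insert the argument right after the first "jsub" word.
    sed -E '/-release[[:space:]]/!s#(^|[[:space:]/])jsub[[:space:]]#&-release buster #'
}

# Demonstration with two sample entries (the second is already migrated).
printf '%s\n' \
    '10 * * * * /usr/bin/jsub -N cron-10 -once -quiet ./mywrappedjob.sh' \
    '12 * * * * /usr/bin/jsub -release buster -N cron-11 -once -quiet /bin/echo hello' \
    | add_buster_release
```

A possible workflow, under the same assumptions, is to run crontab -l | add_buster_release, compare the output with your current crontab, and only then install it. Note that the function would also touch commented-out lines that mention jsub.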
What are the primary changes with moving to Buster?
Language runtime and library versions
The vast majority of the language runtimes and libraries installed on the grid nodes are upgraded in Buster.
Runtime    Stretch Version    Buster Version
Python3    3.5.3              3.7.3
PHP        7.2                7.3
Python2    2.7.13             2.7.16
NodeJS     8.11.1             10.24.0
Perl       5.24               5.28
Java       11.0.6             11.0.9
Ruby       2.3.3              2.5.5
Mono       5.12.0             5.18.0
TCL        8.6.0              8.6.9
           3.3.3              3.5.2
Solutions to common problems
Having trouble with the new grid? If the answer to your problem isn't here, ask for help in #wikimedia-cloud or file a task in Phabricator using this template.
Executable paths with jsub
Some system executables have changed their path between Debian releases. When scheduling jobs, the jsub command resolves the full path for executable files, which means that a given command may fail if the destination grid is not the same Debian release as the server on which jsub is run.
This happens in particular when running jsub from a Debian Buster bastion host to schedule jobs in the Debian Stretch grid.
The solutions to this problem are simple in most cases:
use a wrapper script as the entry point for your jobs.
schedule a job for a given grid release from a bastion running the matching Debian release.
if possible, use an explicit full path when scheduling your jsub jobs.
Example of a problematic jsub operation, in a Debian Buster bastion:
tool.mytool@tools-sgebastion-10:~ $ jsub -N myjob-stretch -release stretch echo hi
Your job 3595 ("myjob-stretch") has been submitted
tool.mytool@tools-sgebastion-10:~ $ jsub -N myjob-buster -release buster echo hi
Your job 3596 ("myjob-buster") has been submitted
tool.mytool@tools-sgebastion-10:~ $ cat myjob-stretch.*
-bash: /usr/bin/echo: No such file or directory
tool.mytool@tools-sgebastion-10:~ $ cat myjob-buster.*
hi
Example of a good jsub operation, in a Debian Buster bastion:
tool.mytool@tools-sgebastion-10:~ $ jsub -N mywrappedjob-stretch -release stretch ./my-wrapper.sh
Your job 3596 ("mywrappedjob-stretch") has been submitted
tool.mytool@tools-sgebastion-10:~ $ jsub -N myjob-stretch -release stretch /bin/echo hi
Your job 3597 ("myjob-stretch") has been submitted
tool.mytool@tools-sgebastion-10:~ $ jsub -N myjob-buster -release buster /bin/echo hi
Your job 3598 ("myjob-buster") has been submitted
tool.mytool@tools-sgebastion-10:~ $ cat mywrappedjob-stretch.*
hi
tool.mytool@tools-sgebastion-10:~ $ cat myjob-stretch.*
hi
tool.mytool@tools-sgebastion-10:~ $ cat myjob-buster.*
hi
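The good example calls ./my-wrapper.sh without showing its contents. One possible shape for such a wrapper is sketched below; the script body is an assumption for illustration, not the script used to produce the output above. The idea is that jsub only needs to resolve the wrapper itself, while the commands inside it are looked up by the shell on the execution host when the job runs.

```shell
# Write a hypothetical wrapper script (the file name matches the example).
cat > my-wrapper.sh <<'EOF'
#!/bin/bash
# Commands are named without a path here, so the shell on the execution
# host looks them up in its own PATH when the job runs.
set -euo pipefail
echo hi
EOF
chmod +x my-wrapper.sh

# Run it locally once before submitting it with jsub.
./my-wrapper.sh
```

Replace the echo line with the real command of your job. Keeping the wrapper in your tool's home directory means the same file is available from both bastions and grids.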
For cron jobs, the problem is the same, with the particularity that our cron server is Debian Stretch and will stay that way until the end of the migration, when we will introduce a Debian Buster cron server replacement.
Example of a reasonable crontab file (uses either a wrapper or full path):
tool.mytool@tools-sgebastion-10:~ $ crontab -l
10 * * * * /usr/bin/jsub -N cron-10 -once -quiet ./mywrappedjob.sh
12 * * * * /usr/bin/jsub -N cron-11 -once -quiet /bin/echo hello
Example of a bad crontab file (no wrapper, missing full path):
tool.mytool@tools-sgebastion-10:~ $ crontab -l
10 * * * * /usr/bin/jsub -N cron-10 -once -quiet echo hello
Rebuild virtualenv for python users
Since the Python executables and libraries are updated in Debian Buster, local virtualenvs need to be deleted and re-created on the new bastion before anything that runs from them will work. Old virtualenvs are likely to cause several kinds of errors, the most obvious being an unexpected ImportError.
Using a requirements file may make this simpler in many cases, if your project doesn't already use one. You can create one in your local directory by running pip freeze > requirements.txt in your tool folder with your virtualenv activated. Later, after you have deleted the old virtualenv and created a new one, you can run pip install -r requirements.txt to install the libraries into the new environment. For more information on this option, see pip's documentation on requirements files.
Example 1: Upgrading a Stretch grid engine based tool to the Buster grid
Follow these steps if you manually submit jobs using jsub, or if you submit jobs using a crontab.
ssh
become YOUR_TOOL
rm -rf venv  # This will destroy the virtualenv and all libraries, so make sure you know what you will need to install later!
python3 -m venv venv
source venv/bin/activate
pip3 install --upgrade pip  # upgrade pip itself to avoid problems with older versions
pip3 install ...  # Here you'd use the requirements file syntax if you have one, or you'd manually install each needed library.
Example 2: Upgrading a uWSGI webservice into a Kubernetes container
If you are currently running your uWSGI webservice under the Grid Engine backend (i.e., the webservice uwsgi-python command), and you want to upgrade to a uWSGI webservice running under Kubernetes (i.e., the webservice --backend=kubernetes python command), you should rebuild your virtualenv as follows:
ssh
become YOUR-TOOL
webservice --backend kubernetes python stop
webservice --backend kubernetes python shell  # do not skip this step; setting up the venv directly from the bastion may result in serious performance issues (compare T214086)
rm -rf www/python/venv/  # this will destroy the virtualenv and all libraries, so make sure you know what you will need to install later!
python3 -m venv www/python/venv/
source www/python/venv/bin/activate
pip3 install --upgrade pip  # upgrade pip itself to avoid problems with older versions
pip3 install -r www/python/src/requirements.txt  # assuming your tool has a requirements.txt file
webservice --backend kubernetes python start
Example 3: Upgrading a Kubernetes uWSGI webservice
If you are already using the Kubernetes backend, there is nothing you need to do -- the container will use the same image as before.
Delete a tool
Tracked in Phabricator: Task T170355 (Resolved)
Some tools were experiments that are done, others were made obsolete by other tools, and some are just things that the original maintainer is tired of caring for. Maintainers can mark their tools for deletion using the "Disable tool" button on the tool's detail page. Disabling a tool will immediately stop any running jobs, including webservices, and prevent maintainers from logging in as the tool. Disabled tools are archived and deleted after 40 days. Disabled tools can be re-enabled at any time prior to being archived and deleted.
SSH to login-buster.toolforge.org fails with 'Permission denied (publickey)'
Tracked in Phabricator: Task T168433
This is typically caused by the newer version of sshd provided by Debian Buster refusing, on the server side, to authenticate an insecure or deprecated public key type. Specifically, support for DSA (ssh-dss) keys was deprecated in OpenSSH 7.0. If your ssh public key starts with the string "ssh-dss" you will be impacted by this. RSA keys smaller than 1024 bits are also deprecated.
First make sure that you are passing a valid key by attempting to ssh to login.toolforge.org using the same public key and username. If this also fails, the problem is likely something other than the ssh key type. Join us in #wikimedia-cloud for interactive debugging help.
If you can ssh to login.toolforge.org with no errors, your key is probably of an unsupported type. Generate a new ssh key pair and upload the public key. We currently recommend using either ed25519 or 4096-bit RSA keys. See Production shell access#Generating your SSH key for more information.
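The key type check described above can be sketched as a small shell function that looks at the first field of a public key line. The function name ssh_key_status and the sample key strings are hypothetical placeholders; the function only inspects the type prefix and does not measure RSA key length.

```shell
# Hypothetical helper: classify a public key line by its leading type field.
ssh_key_status() {
    case "$1" in
        "ssh-dss "*)     echo "unsupported: DSA key, generate a new key pair" ;;
        "ssh-ed25519 "*) echo "ok: ed25519 key" ;;
        "ssh-rsa "*)     echo "check size: RSA key" ;;
        *)               echo "unknown key type" ;;
    esac
}

# Placeholder key strings, not real keys.
ssh_key_status "ssh-dss AAAAB3placeholder user@example"
ssh_key_status "ssh-ed25519 AAAAC3placeholder user@example"

# To test your own key (the file name depends on your setup):
#   ssh_key_status "$(cat ~/.ssh/id_rsa.pub)"
# To create a new ed25519 key pair:
#   ssh-keygen -t ed25519
```

For an RSA key, ssh-keygen -l -f followed by the public key file prints the key length, which you can compare against the recommendation above.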
SSH to login-buster.toolforge.org fails with 'Permission denied (publickey,hostbased)'
If you face this problem, make sure you are logging in with the correct shell name, shown in your User Preferences as "Instance shell account name". Use this name whenever you log in to a Toolforge server, whether Stretch or Buster.
How to determine where your job is running
If you want to know whether your job is running on the old Stretch grid or the new Buster grid, use the qstat command with the -xml flag.
tool.mytool@tools-sgebastion-10:~ $ qstat -xml
The hostname component of the queue that the job is running under tells you whether the queue is running on Stretch (contains -09nn, where nn is any two digits) or Buster (contains -10-).
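The hostname rule above can be applied with a shell pattern match. This is a minimal sketch: the function name grid_release_for_host and the two sample hostnames are made up for illustration, and you would pass in the hostname you read from your own qstat -xml output.

```shell
# Hypothetical helper: map an execution hostname to a grid release using
# the naming rule described above.
grid_release_for_host() {
    case "$1" in
        *-09[0-9][0-9]*) echo stretch ;;   # contains -09nn
        *-10-*)          echo buster ;;    # contains -10-
        *)               echo unknown ;;
    esac
}

# Illustrative hostnames only.
grid_release_for_host tools-sgeexec-0901
grid_release_for_host tools-sgeexec-10-1
```

Anything that matches neither pattern is reported as unknown rather than guessed.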
Monitoring tools
Tools running jobs on Stretch grid engine in last 7 days
This report updates once per hour and will not report jobs that have been seen running on the Buster grid in the same 7 day period.
Report has drill down pages for each maintainer and tool. Examples:
bd808's tools
sge-status tool
Why are we doing this?
This is an implementation of our Operating System Upgrade Policy.
In a nutshell, we use Debian and deprecate versions three years after release and remove them completely from our infrastructure by four years after their release.
Debian Stretch was released in June 2017, and long term support for it (including security updates) will cease in June 2022. We need to shut down all Stretch hosts before the end of support date to ensure that Toolforge remains a secure platform. This migration will take several months because many people still use the Stretch hosts and our users are working on tools in their spare time.
See Operating System Upgrade Policy for more information.
See also
[Cloud-announce] Use of Debian Stretch now discouraged
[Cloud-announce] [IMPORTANT] Announcing Toolforge Debian Stretch Grid Engine deprecation
Wikimedia Techblog: Toolforge and Grid Engine
Wikimedia Techblog: Toolforge GridEngine Debian 10 Buster migration
Wikimedia Techblog: Toolforge Jobs Framework
Communication and support
Support and administration of the WMCS resources is provided by the Wikimedia Foundation Cloud Services team and Wikimedia movement volunteers. Please reach out with questions and join the conversation:
Discuss and receive general support
Chat in real time in the IRC channel #wikimedia-cloud or the bridged Telegram group
Discuss via email after you have subscribed to the cloud@ mailing list
Stay aware of critical changes and plans
Subscribe to the cloud-announce@ mailing list (all messages are also mirrored to the cloud@ list)
Read the News wiki page
Track work tasks and report bugs
Use a subproject of the #Cloud-Services Phabricator project to track confirmed bug reports and feature requests about the Cloud Services infrastructure itself
Read stories and WMCS blog posts
Read the Cloud Services Blog (for the broader Wikimedia movement, see the Wikimedia Technical Blog)