Launching tasks

uData provides a command-line interface for most administrative tasks.

You can get the documentation related to all tasks with:

$ udata -?

You can then get the documentation for the subtasks of a given task:

$ udata user -?

Diagnostic

If you run into issues, start with:

$ udata info

This will display some useful details about your local configuration.
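When reporting a problem, it can help to keep this output in a file. The following is a minimal sketch, not part of uData: the `save_udata_info` function name, the file naming scheme, and the `UDATA` override variable are assumptions made for illustration.

```shell
# UDATA is a hypothetical override so the function can be dry-run (e.g. UDATA=echo).
UDATA="${UDATA:-udata}"

# save_udata_info [DIR]: write the output of `udata info` to a dated file
# in DIR (current directory by default) and print the path of that file.
save_udata_info() {
    dir="${1:-.}"
    file="$dir/udata-info-$(date +%Y%m%d-%H%M%S).txt"
    # Capture stdout and stderr so that error messages end up in the report too.
    "$UDATA" info > "$file" 2>&1
    status=$?
    echo "$file"
    return "$status"
}
```

Calling `save_udata_info /tmp` would then print the path of the file holding the report.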

Managing users

You can create a user with:

$ udata user create

You can also give a user administrative privileges with:

$ udata user set_admin <email>

Purge data flagged as deleted

When users delete data in uData, it’s only flagged as deleted and hidden in the frontend. This allows the administrative team to undelete data in case of error. To remove the flagged data once and for all, you need to purge it, either by launching the appropriate jobs or by executing the purge command:

$ udata purge
-> Purging datasets
-> Purging reuses
-> Purging organizations

Sometimes you need to purge only a given type of data. You can use the appropriate flags to do so:

# purge only datasets
$ udata purge --datasets
-> Purging datasets
# purge only reuses
$ udata purge --reuses
-> Purging reuses
# purge only organizations
$ udata purge --organizations
-> Purging organizations

Warning: these operations are permanent and irreversible.

Note: Users can’t be fully purged because the content they submitted can’t be orphaned. This is why they are only anonymised.
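Because a purge cannot be undone, a site may want to wrap the command in an explicit confirmation step. The following is a minimal sketch under stated assumptions: the `purge_with_confirmation` function, its prompt text, and the `UDATA` override variable are hypothetical. Only the `--datasets`, `--reuses` and `--organizations` flags come from the command described above.

```shell
# UDATA is a hypothetical override so the function can be dry-run (e.g. UDATA=echo).
UDATA="${UDATA:-udata}"

# purge_with_confirmation KIND: purge one type of data after asking for
# confirmation. KIND is one of: datasets, reuses, organizations.
purge_with_confirmation() {
    kind="$1"
    case "$kind" in
        datasets|reuses|organizations) ;;
        *) echo "unknown type: $kind" >&2; return 2 ;;
    esac
    # The prompt goes to stderr so that stdout only carries the command output.
    printf 'Permanently purge deleted %s? [y/N] ' "$kind" >&2
    read -r answer
    if [ "$answer" != "y" ]; then
        echo "aborted" >&2
        return 1
    fi
    "$UDATA" purge "--$kind"
}
```

Anything other than a literal `y` aborts, so an empty answer or a closed input stream never triggers a purge.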

Manage jobs

Jobs are administrative tasks that can be run asynchronously on a worker or synchronously through the shell.

You can list available jobs with:

$ udata job list
-> log-test
-> purge-organizations
-> purge-datasets
-> bump-metrics
-> purge-reuses
-> error-test
-> harvest
-> send-frequency-reminder
-> crawl-resources
-> count-tags

You can launch a job with:

# Run a job synchronously
$ udata job run job-name
# Run a job asynchronously (needs workers)
$ udata job run -d job-name

Some jobs require positional and keyword arguments. You can pass them with -a for positional arguments and -k for keyword arguments:

$ udata job run job-name -a arg1 arg2 -k key1=value key2=value

Note: this is a low-level command. Most of the time you won’t need it, because a dedicated command exists for the task you want to perform.
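As an illustration of the asynchronous form, the three purge jobs from the list above could be queued on the workers in one go. This is a minimal sketch: the `queue_purge_jobs` function and the `UDATA` override variable are assumptions made for illustration, while the job names and the -d flag are the ones shown above.

```shell
# UDATA is a hypothetical override so the function can be dry-run (e.g. UDATA=echo).
UDATA="${UDATA:-udata}"

# queue_purge_jobs: schedule the three purge jobs asynchronously (-d).
# Running workers are needed for the jobs to be executed.
queue_purge_jobs() {
    for job in purge-datasets purge-reuses purge-organizations; do
        # Stop at the first job that cannot be queued.
        "$UDATA" job run -d "$job" || return 1
    done
}
```

Dropping `-d` from the loop would run the same jobs synchronously, one after another, in the current shell.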

Reindexing data

Sometimes you need to reindex data, for example after a breaking model change or a worker failure. You can use the udata search index command to do so.

Without arguments, the command performs a full reindex. With model names as arguments, it reindexes only those models:

# Reindex everything
$ udata search index
# Only reindex reuses and organizations
$ udata search index reuses organizations

By default, the command deletes the previous index on success, or the new unfinished index on error. You can ask it to keep indexes with the -k/--keep parameter:

# Reindex everything but keep the old index
$ udata search index -k

When used from an interactive terminal, the command also prompts for confirmation before deleting an existing index with the same name. This prompt can be bypassed with the -f/--force parameter:

# Reindex everything and delete the old index without prompting
$ udata search index -f

For a partial reindex, model names are accepted in both singular and plural form:

# Only reindex datasets and reuses (plural form)
$ udata search index datasets reuses
# Only reindex datasets and reuses (singular form)
$ udata search index dataset reuse

Workers

Start a worker with:

$ udata worker start

See all waiting Celery tasks across all workers:

$ udata worker status

Display waiting tasks in a format compatible with Munin plugins (you can use the provided Munin plugin):

$ udata worker status --munin -q default
$ udata worker status --munin-config -q default
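Munin calls a plugin with the argument `config` to obtain the graph definition and with no argument to fetch the current values. The sketch below shows how the two commands above map onto that convention. It is not the plugin shipped with uData: the `munin_dispatch` function and the `UDATA` and `QUEUE` variables are assumptions made for illustration.

```shell
# UDATA and QUEUE are hypothetical overrides; QUEUE defaults to the
# "default" queue used in the commands above.
UDATA="${UDATA:-udata}"
QUEUE="${QUEUE:-default}"

# munin_dispatch [config]: print the graph configuration when called with
# "config", and the current values otherwise.
munin_dispatch() {
    case "${1:-}" in
        config) "$UDATA" worker status --munin-config -q "$QUEUE" ;;
        *)      "$UDATA" worker status --munin -q "$QUEUE" ;;
    esac
}
```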