Adapting settings¶

You can specify a configuration file by exporting the UDATA_SETTINGS environment variable.

export UDATA_SETTINGS=/path/to/my/udata.cfg

The configuration file is simply a Python file declaring variables.
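Because the file is plain Python, values can be computed rather than hard-coded. The sketch below is not udata's loader. Using only the standard library, it imitates the convention Flask applies when it loads a Python configuration file, where only upper-case names are kept as settings. The `load_settings` helper and the sample values are hypothetical.

```python
import tempfile
from pathlib import Path


def load_settings(path):
    """Execute a Python settings file and keep only its upper-case names.

    Hypothetical helper: it imitates Flask's convention, not udata's code.
    """
    namespace = {}
    source = Path(path).read_text(encoding="utf-8")
    exec(compile(source, str(path), "exec"), namespace)
    return {key: value for key, value in namespace.items() if key.isupper()}


# A throwaway settings file, written only for this demonstration.
sample = """
minutes_per_hour = 60          # lower-case helper, not kept as a setting
DEBUG = False
TEMPLATE_CACHE_DURATION = 2 * minutes_per_hour
PLUGINS = ['front']
"""

with tempfile.TemporaryDirectory() as directory:
    cfg_path = Path(directory) / "udata.cfg"
    cfg_path.write_text(sample, encoding="utf-8")
    settings = load_settings(cfg_path)

print(sorted(settings))
```

Lower-case names such as `minutes_per_hour` are usable as intermediate values but do not end up in the resulting settings.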

udata uses a few Flask extensions and therefore provides all available options for these extensions. Most of the time, it tries to provide sane defaults.

Flask and global behavior options¶

DEBUG¶

default: False

A boolean specifying the debug mode.

SEND_MAIL¶

default: True

A boolean specifying if the emails should actually be sent.

DEFAULT_LANGUAGE¶

default: 'en'

The default fallback language when no language prefix is provided in URLs.

SECRET_KEY¶

A secret key used as salt for cryptographic parts. You must specify your own secure key and use the same in all your instances.

SITE_ID¶

default: 'default'

The site identifier. It is used to attach some database configuration, metrics…

SITE_TERMS_LOCATION¶

default: generic embedded terms

The site terms in Markdown. It can be either a URL or a local path to a Markdown file. If it is a URL, the content is downloaded the first time the terms page is displayed, then cached.

PLUGINS¶

default: []

A list of enabled udata plugins.

THEME¶

default: 'default'

The enabled theme name.

TEMPLATE_CACHE_DURATION¶

default: 5

The duration used for templates’ cache, in minutes.

ALLOWED_RESOURCES_EXTENSIONS¶

default:

[
    # Base
    'csv', 'txt', 'json', 'pdf', 'xml', 'rdf', 'rtf', 'xsd',
    # OpenOffice
    'ods', 'odt', 'odp', 'odg',
    # Microsoft Office
    'xls', 'xlsx', 'doc', 'docx', 'pps', 'ppt',
    # Archives
    'tar', 'gz', 'tgz', 'rar', 'zip', '7z', 'xz', 'bz2',
    # Images
    'jpeg', 'jpg', 'jpe', 'gif', 'png', 'dwg', 'svg', 'tiff', 'ecw', 'svgz', 'jp2',
    # Geo
    'shp', 'kml', 'kmz', 'gpx', 'shx', 'ovr', 'geojson',
    # Meteorology
    'grib2',
    # Misc
    'dbf', 'prj', 'sql', 'epub', 'sbn', 'sbx', 'cpg', 'lyr', 'owl',
]

The list of file extensions that users are allowed to upload as resources.

RESOURCES_FILE_ALLOWED_DOMAINS¶

default: []

Whitelist of URL domains allowed for resources whose filetype equals file.

SERVER_NAME is always included.

* is a supported value as a wildcard allowing all domains.
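The three rules above (explicit list, implicit SERVER_NAME, `*` wildcard) can be sketched as follows. This is an illustration under stated assumptions, not udata's implementation: `is_file_domain_allowed` is a hypothetical helper, the domain values are made up, and exact host matching is assumed.

```python
from urllib.parse import urlparse

# Hypothetical values, for illustration only.
SERVER_NAME = 'www.data.dev'
RESOURCES_FILE_ALLOWED_DOMAINS = ['static.data.dev']


def is_file_domain_allowed(url, allowed=RESOURCES_FILE_ALLOWED_DOMAINS,
                           server_name=SERVER_NAME):
    """Return True if the URL's host passes the allow-list rules above."""
    if '*' in allowed:
        return True  # wildcard: every domain is accepted
    host = urlparse(url).hostname
    # SERVER_NAME is always accepted, even when it is not listed.
    return host == server_name or host in allowed
```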

PREVIEW_MODE¶

default: 'iframe'

Defines the resources preview mode. Can be one of:

- 'iframe': previews are displayed in an iframe modal
- 'page': previews are displayed in a new page

To disable previews, set PREVIEW_MODE to None.

ARCHIVE_COMMENT_USER_ID¶

default: None

The id of an existing user which will post a comment when a dataset is archived.

ARCHIVE_COMMENT_TITLE¶

default: _('This dataset has been archived')

The title of the comment optionally posted when a dataset is archived. NB: the content of the comment is located in udata/templates/comments/dataset_archived.txt.

SCHEMA_CATALOG_URL¶

default: None

The URL to a schema catalog, listing schemas resources can conform to. The URL should be a JSON endpoint, returning a schema catalog. Example: https://schema.data.gouv.fr/schemas/schemas.json

NB: this is used by the datasets/schemas API to fill the schema field of a Resource.

URLs validation¶

URLS_ALLOW_PRIVATE¶

default: False

Whether or not to allow the submission of private URLs (private IPs…).

URLS_ALLOW_LOCAL¶

default: False

Whether or not to allow the submission of local URLs (localhost…). When developing, you might need to set this to True.

URLS_ALLOW_CREDENTIALS¶

default: True

Whether or not to allow credentials in submitted URLs.

URLS_ALLOWED_SCHEMES¶

default: ('http', 'https', 'ftp', 'ftps')

List of allowed URL schemes.
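To make the effect of these four settings concrete, here is a minimal sketch of how a submitted URL could be screened with the default values. The `check_url` helper is hypothetical and uses only the standard library; udata's real validator may apply different or stricter rules.

```python
import ipaddress
from urllib.parse import urlparse

# Defaults copied from the settings documented above.
URLS_ALLOW_PRIVATE = False
URLS_ALLOW_LOCAL = False
URLS_ALLOW_CREDENTIALS = True
URLS_ALLOWED_SCHEMES = ('http', 'https', 'ftp', 'ftps')


def check_url(url):
    """Return the list of reasons a URL would be rejected (empty if none).

    Hypothetical helper; it only illustrates what each setting controls.
    """
    parts = urlparse(url)
    host = parts.hostname or ''
    problems = []
    if parts.scheme not in URLS_ALLOWED_SCHEMES:
        problems.append('scheme')
    if not URLS_ALLOW_CREDENTIALS and (parts.username or parts.password):
        problems.append('credentials')
    try:
        address = ipaddress.ip_address(host)
    except ValueError:
        address = None  # the host is a name, not an IP literal
    is_local = host == 'localhost' or (address is not None and address.is_loopback)
    if is_local:
        if not URLS_ALLOW_LOCAL:
            problems.append('local')
    elif address is not None and address.is_private:
        if not URLS_ALLOW_PRIVATE:
            problems.append('private')
    return problems
```

With the defaults, a URL embedding credentials is accepted while `localhost` and private IP addresses are refused.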

URLS_ALLOWED_TLDS¶

default: All IANA registered TLDs

List of allowed TLDs. When using udata on an intranet, you might want to add your own custom TLDs:

from udata.settings import Defaults

# Sets are combined with the union operator (|); they do not support +.
URLS_ALLOWED_TLDS = Defaults.URLS_ALLOWED_TLDS | {'custom', 'company'}

EXPORT_CSV_MODELS¶

default: ('dataset', 'resource', 'discussion', 'organization', 'reuse', 'tag')

List of models that will be exported to CSV by the export-csv job. You can disable the feature by setting this to an empty list.

EXPORT_CSV_DATASET_ID¶

default: None

The id of a dataset that holds the CSV exports. This dataset should be created before running the export-csv job.

Search configuration¶

SEARCH_AUTOCOMPLETE_ENABLED¶

default: True

Enables the search autocomplete on frontend if set to True, disables otherwise.

SEARCH_AUTOCOMPLETE_DEBOUNCE¶

default: 200

SEARCH_SERVICE_API_URL¶

default: None

The URL of the independent search service API to use, if available. If not specified, MongoDB full-text search is used.

Ex:

SEARCH_SERVICE_API_URL = 'http://127.0.0.1:5000/api/1/'

See udata-search-service for more information on using a search service.

Spatial configuration¶

SPATIAL_SEARCH_EXCLUDE_LEVELS¶

default: tuple()

List of spatial levels that shouldn’t be indexed (for time, performance and user experience).

Territories configuration¶

ACTIVATE_TERRITORIES¶

default: False

Whether you want to activate pages and API related to territories. Don’t forget to set the HANDLED_LEVELS setting too.

HANDLED_LEVELS¶

default: tuple()

The list of levels that you want to deal with.

Warning: the order is important and will determine parents/children for a given territory. You have to set the smallest territory level first:

HANDLED_LEVELS = ('fr:commune', 'fr:departement', 'fr:region')
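The role of the ordering can be pictured with a small sketch. It assumes that each level's parent is simply the next entry in the tuple; `parent_level` is a hypothetical helper written for this illustration, not a udata function.

```python
HANDLED_LEVELS = ('fr:commune', 'fr:departement', 'fr:region')


def parent_level(level, levels=HANDLED_LEVELS):
    """Return the level directly above `level`, or None for the largest one."""
    index = levels.index(level)
    return levels[index + 1] if index + 1 < len(levels) else None


print(parent_level('fr:commune'))
```

Reversing the tuple would invert these relationships, which is why the smallest level has to come first.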

Harvesting configuration¶

HARVEST_PREVIEW_MAX_ITEMS¶

default: 20

The number of items to fetch while previewing a harvest source.

HARVEST_MAX_ITEMS¶

default: None

The maximum number of items to fetch when harvesting (development setting).

HARVEST_DEFAULT_SCHEDULE¶

default: 0 0 * * *

A cron expression used as default harvester schedule when validating harvesters.
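A cron expression has five whitespace-separated fields: minute, hour, day of month, month and day of week. The default `0 0 * * *` therefore means every day at midnight. The sketch below only names the fields; `split_cron` and the field names are chosen for this illustration.

```python
CRON_FIELDS = ('minute', 'hour', 'day_of_month', 'month_of_year', 'day_of_week')


def split_cron(expression):
    """Map the five whitespace-separated cron fields to their names."""
    values = expression.split()
    if len(values) != len(CRON_FIELDS):
        raise ValueError(f'expected 5 fields, got {len(values)}')
    return dict(zip(CRON_FIELDS, values))


HARVEST_DEFAULT_SCHEDULE = '0 0 * * *'
print(split_cron(HARVEST_DEFAULT_SCHEDULE))
```

For example, `'30 2 * * 1'` would schedule harvesting at 02:30 every Monday.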

HARVEST_JOBS_RETENTION_DAYS¶

default: 365

The number of days of harvest jobs to keep (i.e. the number of days of history kept).

Link checker configuration¶

LINKCHECKING_ENABLED¶

default: True

A flag to enable checking of resource URLs by an external link checker.

LINKCHECKING_DEFAULT_LINKCHECKER¶

default: no_check

An entrypoint key of udata.linkcheckers that will be used as a default link checker, i.e. when no specific link checker is set for a resource (via resource.extras.check:checker).

LINKCHECKING_IGNORE_DOMAINS¶

default: []

A list of domains to ignore when triggering link checking of resource URLs.

LINKCHECKING_IGNORE_PATTERNS¶

default: ['format=shp']

A list of patterns; a checked URL containing one of them is ignored (i.e. pattern in url).

LINKCHECKING_MIN_CACHE_DURATION¶

default: 60

The minimum time in minutes between two consecutive checks of a resource’s url.

LINKCHECKING_MAX_CACHE_DURATION¶

default: 1080

The maximum time in minutes between two consecutive checks of a resource’s url.

LINKCHECKING_UNAVAILABLE_THRESHOLD¶

default: 100

The number of unavailable checks after which the resource is considered lastingly unavailable and won’t be checked as often.

Mongoengine/Flask-Mongoengine options¶

MONGODB_HOST¶

default: 'mongodb://localhost:27017/udata'

The MongoDB database used by udata. During tests, the test database will use the same name suffixed by -test.

See the official Flask-MongoEngine documentation for more details.

Authentication is also supported in the URL:

MONGODB_HOST = 'mongodb://<user>:<password>@<host>:<port>/<database>'
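If the user name or password contains reserved characters such as `@`, `:` or `/`, they need to be percent-encoded in the URL. Since the configuration file is Python, the URL can be built in place. The `mongodb_url` helper and the credentials below are placeholders for illustration only.

```python
from urllib.parse import quote_plus


def mongodb_url(user, password, host, port, database):
    """Build a MongoDB connection URL, escaping reserved characters."""
    return 'mongodb://{}:{}@{}:{}/{}'.format(
        quote_plus(user), quote_plus(password), host, port, database
    )


# Placeholder credentials; never commit real ones.
MONGODB_HOST = mongodb_url('udata', 'p@ss:word/1', 'localhost', 27017, 'udata')
print(MONGODB_HOST)
```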

MONGODB_HOST_TEST¶

default: same as MONGODB_HOST with a -test suffix on the database name

An optional alternative mongo database used for testing.

Celery options¶

By default, udata is configured to use Redis as Celery backend and a customized MongoDB scheduler.

The defaults are:

CELERY_BROKER_URL = 'redis://localhost:6379'
CELERY_BROKER_TRANSPORT_OPTIONS = {
    'fanout_prefix': True,
    'fanout_patterns': True,
}
CELERY_RESULT_BACKEND = 'redis://localhost:6379'
CELERY_ACCEPT_CONTENT = ['pickle', 'json']
CELERY_WORKER_HIJACK_ROOT_LOGGER = False
CELERY_BEAT_SCHEDULER = 'udata.tasks.Scheduler'
CELERY_MONGODB_SCHEDULER_COLLECTION = "schedules"

Authentication is supported on Redis:

CELERY_RESULT_BACKEND = 'redis://u:<password>@<host>:<port>'
CELERY_BROKER_URL = 'redis://u:<password>@<host>:<port>'

You can see the full list of Celery options in the Celery official documentation.

Note: Celery parameters changed in udata 1.2 because Celery was upgraded to 4.1.0. (You can get the change map here). udata expects Celery parameters to be upper case and prefixed by CELERY_ in your udata.cfg; they are automatically transformed for Celery 4.x. Examples:

- Celery 3.x expected BROKER_URL and Celery 4.x expects broker_url, so you need to change BROKER_URL to CELERY_BROKER_URL in your settings.
- Celery 3.x expected CELERY_RESULT_BACKEND and Celery 4.x expects result_backend, so you can leave CELERY_RESULT_BACKEND as is.
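The naming rule described in this note amounts to stripping the CELERY_ prefix and lower-casing the remainder. The sketch below shows that rule only; `to_celery4_settings` is a hypothetical helper, not the code udata uses for the transformation.

```python
def to_celery4_settings(config, prefix='CELERY_'):
    """Strip the prefix and lower-case every matching key.

    Hypothetical helper showing the naming rule only, not udata's code.
    """
    return {
        key[len(prefix):].lower(): value
        for key, value in config.items()
        if key.startswith(prefix)
    }


udata_cfg = {
    'CELERY_BROKER_URL': 'redis://localhost:6379',
    'CELERY_RESULT_BACKEND': 'redis://localhost:6379',
    'DEBUG': True,  # not a Celery setting, so it is left out
}
print(to_celery4_settings(udata_cfg))
```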

Flask-Mail options¶

You can see the full configuration option list in the official Flask-Mail documentation.

MAIL_DEFAULT_SENDER¶

default: 'webmaster@udata'

The default identity used for outgoing mails.

Authlib options¶

udata uses Authlib to provide OAuth2 on the API. The full option list is available in the official Authlib documentation.

OAUTH2_TOKEN_EXPIRES_IN¶

default:

    {
        'authorization_code': 10 * 24 * HOUR,
        'implicit': 10 * 24 * HOUR,
        'password': 10 * 24 * HOUR,
        'client_credentials': 10 * 24 * HOUR
    }

The OAuth2 token duration.
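The HOUR name used in the default is not available in your own configuration file, so an override has to define it. The sketch below assumes durations are expressed in seconds, which makes HOUR equal to 3600 and the default equal to ten days. The shortened durations are arbitrary examples.

```python
# Assumption: durations are expressed in seconds, so one hour is 3600.
HOUR = 60 * 60

OAUTH2_TOKEN_EXPIRES_IN = {
    'authorization_code': 24 * HOUR,       # one day instead of ten
    'implicit': 1 * HOUR,                  # one hour
    'password': 24 * HOUR,                 # one day
    'client_credentials': 10 * 24 * HOUR,  # unchanged: ten days
}
```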

OAUTH2_PROVIDER_ERROR_ENDPOINT¶

default: 'oauth.oauth_error'

The OAuth2 error page. Do not modify unless you know what you are doing.

Flask-Security options¶

SECURITY_PASSWORD_LENGTH_MIN¶

default: 6

The minimum required password length.

SECURITY_PASSWORD_REQUIREMENTS_LOWERCASE¶

default: False

If set to True, the new passwords will need to contain at least one lowercase character.

SECURITY_PASSWORD_REQUIREMENTS_DIGITS¶

default: False

If set to True, the new passwords will need to contain at least one digit.

SECURITY_PASSWORD_REQUIREMENTS_UPPERCASE¶

default: False

If set to True, the new passwords will need to contain at least one uppercase character.

SECURITY_PASSWORD_REQUIREMENTS_SYMBOLS¶

default: False

If set to True, the new passwords will need to contain at least one symbol.
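As an illustration, a stricter policy could combine these settings as shown below. The `unmet_requirements` helper is hypothetical and only restates what each flag asks for; in particular, treating `string.punctuation` as the set of symbols is an assumption, and the real validation may differ.

```python
import string

# Example of a stricter policy than the defaults.
SECURITY_PASSWORD_LENGTH_MIN = 12
SECURITY_PASSWORD_REQUIREMENTS_LOWERCASE = True
SECURITY_PASSWORD_REQUIREMENTS_DIGITS = True
SECURITY_PASSWORD_REQUIREMENTS_UPPERCASE = True
SECURITY_PASSWORD_REQUIREMENTS_SYMBOLS = False


def unmet_requirements(password):
    """List the requirements a candidate password fails (hypothetical helper)."""
    unmet = []
    if len(password) < SECURITY_PASSWORD_LENGTH_MIN:
        unmet.append('length')
    if SECURITY_PASSWORD_REQUIREMENTS_LOWERCASE and not any(c.islower() for c in password):
        unmet.append('lowercase')
    if SECURITY_PASSWORD_REQUIREMENTS_DIGITS and not any(c.isdigit() for c in password):
        unmet.append('digits')
    if SECURITY_PASSWORD_REQUIREMENTS_UPPERCASE and not any(c.isupper() for c in password):
        unmet.append('uppercase')
    # Assumption: a "symbol" is any ASCII punctuation character.
    if SECURITY_PASSWORD_REQUIREMENTS_SYMBOLS and not any(c in string.punctuation for c in password):
        unmet.append('symbols')
    return unmet
```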

Flask-Cache options¶

udata uses Flask-Cache to handle caching and uses Redis by default. You can see the full options list in the official Flask-Cache documentation.

CACHE_TYPE¶

default: 'redis'

The cache type, which can be adjusted to your needs (e.g. null, memcached).

CACHE_KEY_PREFIX¶

default: 'udata-cache'

A prefix used for cache keys to avoid conflicts with other middleware. It also allows you to use the same backend with different instances.

Flask-FS options¶

udata uses Flask-FS as its storage abstraction.

Flask-CDN options¶

See Flask-CDN README for detailed options.

CDN_DOMAIN¶

default: None

Set this to a domain name. If defined, udata will serve its static assets from this domain.

Avatars/identicon configuration¶

These settings allow you to customize avatar rendering. If set to anything other than a falsy value, these settings take precedence over the theme configuration and the default values.

AVATAR_PROVIDER¶

default: 'internal'

Avatar provider used to render user avatars.

udata provides 3 backends:

internal: udata renders avatars itself using pydenticon
adorable: udata uses Adorable Avatars to render avatars
robohash: udata uses Robohash to render avatars

AVATAR_INTERNAL_SIZE¶

default: 7

Number of blocks (the matrix size) used by the internal provider.

E.g.: 7 will render avatars on a 7x7 matrix

AVATAR_INTERNAL_FOREGROUND¶

default: ['rgb(45,79,255)', 'rgb(254,180,44)', 'rgb(226,121,234)', 'rgb(30,179,253)', 'rgb(232,77,65)', 'rgb(49,203,115)', 'rgb(141,69,170)']

A list of foreground colors used by the internal provider to render the avatars

AVATAR_INTERNAL_BACKGROUND¶

default: 'rgb(224,224,224)'

The background color used by the internal provider

AVATAR_INTERNAL_PADDING¶

default: 10

The padding (in percent) used by the internal provider

AVATAR_ROBOHASH_SKIN¶

default: 'set1'

The skin (set) used by the robohash provider. See https://robohash.org/ for more details.

AVATAR_ROBOHASH_BACKGROUND¶

default: 'bg0' (transparent background)

The background used by the robohash provider. See https://robohash.org/ for more details.

Posts configuration¶

These settings allow you to customize the post feature.

POST_DISCUSSIONS_ENABLED¶

default: False

Whether or not discussions should be enabled on posts

POST_DEFAULT_PAGINATION¶

default: 20

The default page size for post listing

Datasets configuration¶

DATASET_MAX_RESOURCES_UNCOLLAPSED¶

default: 6

Max number of resources to display uncollapsed in dataset view.

Sentry configuration¶

SENTRY_DSN¶

default: None

The Sentry DSN associated to this udata instance. If defined, the Sentry support is automatically activated.

sentry-sdk[flask] needs to be installed for this to work. This requirement is specified in requirements/sentry.pip.

SENTRY_TAGS¶

default: {}

A key-value map of extra tags to pass as Sentry context. See: https://docs.sentry.io/learn/context/

SENTRY_USER_ATTRS¶

default: ['slug', 'email', 'fullname']

Extra user attributes to add to the Sentry context. See: https://docs.sentry.io/learn/context/

SENTRY_LOGGING¶

default: 'WARNING'

Minimum log level to be reported to Sentry.

SENTRY_IGNORE_EXCEPTIONS¶

default: []

A list of extra exceptions to ignore. udata already ignores Werkzeug HTTPException and some internal ones that don’t need to be listed here.

Read only mode¶

READ_ONLY_MODE¶

default: False

Enables the app’s read only mode.

METHOD_BLOCKLIST¶

default: ['OrganizationListAPI.post', 'ReuseListAPI.post', 'DatasetListAPI.post', 'CommunityResourcesAPI.post', 'UploadNewCommunityResources.post', 'DiscussionAPI.post', 'DiscussionsAPI.post', 'IssuesAPI.post', 'IssueAPI.post', 'SourcesAPI.post', 'FollowAPI.post']

List of API endpoints to block when READ_ONLY_MODE is set to True. Endpoints listed here will return a 423 response code to any non-admin request.
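The rule can be restated as a short sketch. `blocked_status` is a hypothetical helper written for this illustration, and the blocklist is shortened from the default; it is not how udata wires the check into its API.

```python
# Values for illustration; the blocklist is shortened from the default.
READ_ONLY_MODE = True
METHOD_BLOCKLIST = ['DatasetListAPI.post', 'ReuseListAPI.post']


def blocked_status(endpoint, is_admin):
    """Return 423 if the request should be refused, otherwise None.

    Hypothetical helper; it only restates the rule described above.
    """
    if READ_ONLY_MODE and not is_admin and endpoint in METHOD_BLOCKLIST:
        return 423  # HTTP 423 Locked
    return None
```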

Fixtures¶

FIXTURE_DATASET_SLUGS¶

default: []

List of dataset slugs to query to fill the fixture file.

Example configuration file¶

Here is a sample configuration file:

DEBUG = True
SEND_MAIL = False

SECRET_KEY = 'A unique secret key'

SERVER_NAME = 'www.data.dev'

DEFAULT_LANGUAGE = 'fr'
PLUGINS = ['front', 'piwik']
SITE_ID = 'www.data.dev'
SITE_TITLE = 'data.dev'
SITE_URL = 'www.data.dev'

DEBUG_TOOLBAR = True

FS_PREFIX = '/s'
FS_ROOT = '/srv/http/www.data.dev/fs'