Source - Base source class

Built-in functionality

The Source base class provides built-in keys which can be set when instantiating any Source.

  • Directory

    The directory variable can be set for all sources of a type in project.conf or per source within an element.

    This sets the location within the build root into which the content of the source will be loaded. If the location does not exist, it will be created.

Abstract Methods

For loading and configuration purposes, Sources must implement the Plugin base class abstract methods.

Attention

In order to ensure that all configuration data is processed at load time, it is important that all URLs have been processed during Plugin.configure().

Source implementations must call either Source.translate_url() or Source.mark_download_url() for every URL that has been specified in the configuration during Plugin.configure().

Sources expose the following abstract methods. Unless explicitly mentioned, these methods are mandatory to implement.

Working with the source ref

The SourceRef is used to determine the exact version of data to be addressed by the source.

The various responsibilities involving the source reference are described here.

Loading and saving

The source reference is expected to be loaded at Plugin.configure() and Source.load_ref() time from the provided MappingNode.

The SourceRef should be loaded from a single key in that node. The recommended name for that key is ref, but the choice is ultimately up to the implementor.

When Source.set_ref() is called, the source reference should be assigned to the same single key in the provided MappingNode. This will be used to serialize changed source references to YAML as a result of tracking.

Tracking new references

When the user tracks for new versions of the source, then the new SourceRef should be returned from the Source.track() implementation.

Managing internal state

Internally the source implementation is expected to keep track of its SourceRef. The internal state should be updated when Plugin.configure(), Source.load_ref() or Source.set_ref() is called.

The internal state should not be updated when Source.track() is called.

The internal source ref must be returned on demand whenever Source.get_ref() is called.
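The responsibilities above can be sketched as follows. This is only an illustration and not a real plugin: RefStateSketch is a hypothetical stand-in that does not derive from Source, and a plain dict stands in for the MappingNode so that the example runs without BuildStream installed. A real plugin would use the MappingNode accessors instead of dict methods.

```python
class RefStateSketch:
    """Hypothetical stand-in showing only the ref bookkeeping of a Source."""

    def __init__(self):
        self.ref = None

    def configure(self, node):
        # Load the ref from a single key; "ref" is the recommended name.
        self.ref = node.get("ref")

    def load_ref(self, node):
        # Same key as in configure(), read from an alternative node.
        self.ref = node.get("ref")

    def get_ref(self):
        # Return the internal state on demand.
        return self.ref

    def set_ref(self, ref, node):
        # Update the internal state and write back to the same single key.
        self.ref = ref
        node["ref"] = ref

    def track(self):
        # Return the new ref WITHOUT updating self.ref. The value is a
        # made-up placeholder; a real plugin would query its remote here.
        return "hypothetical-new-ref"


node = {"ref": "abc123"}
source = RefStateSketch()
source.configure(node)

new_ref = source.track()
ref_after_track = source.get_ref()  # still the configured ref

source.set_ref(new_ref, node)       # now both the state and the node change
```

Note that tracking alone leaves the internal state untouched; the state only changes once set_ref() is called with the tracked value.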

Generating the unique key

When Plugin.get_unique_key() is called, the source’s SourceRef must be considered as a part of that key.

The unique key will be used to generate the cache keys of elements using this source, and so the unique key should comprise every configuration which may affect how the source is staged, as well as any configuration which uniquely identifies the source, which of course includes the SourceRef.

When plugins generate SourceInfo, it is also important that any configuration attributes which contribute to the generation of SourceInfo also be included in the unique key.
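As a rough sketch of this rule, the function below assembles a key from the values a simple download plugin might hold. The function name and all three parameters are hypothetical, and the example assumes that the URL is included as written in the element (before alias translation), since aliases are allowed to change without affecting cache keys.

```python
def unique_key_sketch(original_url, ref, version_guess_pattern=None):
    # Everything that affects staging, identifies the source, or shapes the
    # reported SourceInfo goes into the key. The ref is always included.
    return [original_url, ref, version_guess_pattern]


key_a = unique_key_sketch("upstream:pony.tgz", "abc123")
key_b = unique_key_sketch("upstream:pony.tgz", "def456")
key_c = unique_key_sketch("upstream:pony.tgz", "abc123", r"pony-(\d+)")
```

Here key_a and key_b differ because the ref differs, and key_a and key_c differ because a setting which influences the reported SourceInfo differs.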

Accessing previous sources

In the general case, all sources are fetched and tracked independently of one another. In situations where a source needs to access previous source(s) in order to perform its own track and/or fetch, the BST_REQUIRES_PREVIOUS_SOURCES_TRACK and BST_REQUIRES_PREVIOUS_SOURCES_FETCH attributes of the Source class can be set to request access to previous sources.

The intended use of such plugins is to fetch external dependencies of other sources, typically using some kind of package manager, such that all the dependencies of the original source(s) are available at build time.

When implementing such a plugin, implementors should adhere to the following guidelines:

  • Implementations must be able to store the obtained artifacts in a subdirectory.

  • Implementations must be able to deterministically generate a unique ref, such that two refs are different if and only if they produce different outputs.

  • Implementations must not introduce host contamination.
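One possible way to approach the second guideline is to derive the ref from the obtained content itself. The sketch below is an assumption about how this could be done, not something BuildStream prescribes: deterministic_ref is a hypothetical helper, and a dict of relative path to file content stands in for the downloaded artifacts.

```python
import hashlib


def deterministic_ref(artifacts):
    """Derive a ref from a mapping of relative path -> file content (bytes)."""
    digest = hashlib.sha256()
    # Sort by path so the iteration order of the mapping cannot change the ref.
    for path in sorted(artifacts):
        content = artifacts[path]
        # Length-prefix each field so that field boundaries are unambiguous.
        for field in (path.encode("utf-8"), content):
            digest.update(len(field).to_bytes(8, "big"))
            digest.update(field)
    return digest.hexdigest()


ref_1 = deterministic_ref({"a.whl": b"one", "b.whl": b"two"})
ref_2 = deterministic_ref({"b.whl": b"two", "a.whl": b"one"})  # same content, other order
ref_3 = deterministic_ref({"a.whl": b"one", "b.whl": b"TWO"})  # different content
```

With a cryptographic hash, equal outputs yield equal refs, and differing outputs yield differing refs for all practical purposes.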

Generating SourceInfo for provenance information

Source plugins should implement either of the Source.collect_source_info() or SourceFetcher.get_source_info() methods in order to properly report provenance information and contribute to reports such as SBoMs.

To implement these methods, you must use Source.create_source_info() to instantiate the SourceInfo object to return from these methods.

Attention

It is not recommended to consider the parameters used for implementing tracking with Source.track().

Instead, any versioning information reported should be congruent with the URL and the current source reference.

Furthermore, if any of the configuration attributes implemented by the plugin contribute to the generation of the SourceInfo objects, these configuration values must be considered in the plugin’s Plugin.get_unique_key() implementation.

What follows are some guidelines and conventions for doing this properly.

The URL

The URL argument represents the location from which the source is obtained, and should normally be the translated URL, as returned by Source.translate_url().

In the case of SourceInfoMedium.LOCAL, the URL can instead be a project relative path to the local data.

The medium and version_type arguments

These refer to the medium by which the source data was obtained, and the meaning/type of the following “version” argument, respectively.

When possible, you should use the SourceInfoMedium and SourceVersionType values which correspond to the medium and version type which your Source plugin is using.

In cases where there is not a suitable value available for your plugin, you can alternatively provide a freeform string which provides these.

Documentation

Your plugin’s module-level docstring, which is used for documenting your plugin, should have a section describing the meaning of these values.

This is especially useful to promote interoperability with other tooling, which might want to perform some automations based on the SourceInfo object(s) which your plugin reports.

Version

This is a string which uniquely identifies the version of the source, and its meaning is described by the “version_type” you specified.

Version guess

This is a human readable simplified version, more suitable for a cursory reading of a report like an SBoM.

Since it is in most cases not possible to automatically determine the version string intended by upstream maintainers based on the knowledge you have, we refer to this as a guessed version. For example, just because you have a tarball named pony-1.2.3.tgz somewhere, does not guarantee that this is really version 1.2.3 of the “pony” project.

Configurability

When implementing a technique for guessing the version based on the information you have at hand, it is recommended to provide some flexibility to users of your plugin, who may have better knowledge about the conventions used by the upstream project and how they choose to express their versioning information.

An example of this is the version-guess-pattern configuration made available in the DownloadableFileSource built-in functionality.
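To illustrate what such configurability could look like, here is a minimal sketch of a pattern-based guess. The guess_version helper, its default pattern and the choice to join capture groups with dots are all assumptions made for this example; they are not a description of how the version-guess-pattern setting is implemented.

```python
import re

# Hypothetical default: capture dotted numeric runs such as "1.2.3".
DEFAULT_PATTERN = r"(\d+(?:\.\d+)+)"


def guess_version(url, pattern=DEFAULT_PATTERN):
    """Return a guessed version from the last path component, or None."""
    filename = url.rsplit("/", 1)[-1]
    match = re.search(pattern, filename)
    if match is None:
        return None
    groups = [group for group in match.groups() if group is not None]
    if not groups:
        # The pattern has no capture groups: use the whole match.
        return match.group(0)
    # Join the captured parts, so a user pattern can pick out each component.
    return ".".join(groups)


guess_default = guess_version("https://example.com/releases/pony-1.2.3.tgz")
guess_custom = guess_version(
    "https://example.com/releases/pony_1_2_3.tgz",
    pattern=r"pony_(\d+)_(\d+)_(\d+)",
)
guess_none = guess_version("https://example.com/releases/pony-flight-release.tgz")
```

The user-supplied pattern in the second call handles an upstream naming convention that the default pattern does not recognize, and the third call shows that a guess is not always possible.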

Explicit versioning

In some use cases, it is impossible to derive a guessed version from the information available to the plugin.

For instance, consider an upstream which indexes their releases on a web page and then hosts their releases without namespacing their release archives. In such a case you might have a URL that looks something like: https://flying-ponies.com/releases/9d0c936c78/pony-flight-release.tgz

For this reason, the implementing plugin should provide a way for users to manually annotate the source version.

An example of this is the version configuration made available in the DownloadableFileSource built-in functionality.

Extra data

In the case that the existing fields are insufficient to accurately describe the provenance of this source, extra key/values can be specified when calling Source.create_source_info().
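The conventions in this section can be summarized with a small sketch. SourceInfoSketch and describe_tarball are hypothetical stand-ins so that the example runs without BuildStream; a real plugin would pass the same arguments to Source.create_source_info() and would prefer the SourceInfoMedium and SourceVersionType values over the freeform strings used here. The checksum, the annotated version and the extra data key are placeholders.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class SourceInfoSketch:
    """Hypothetical stand-in mirroring the documented SourceInfo fields."""

    url: str
    medium: str
    version_type: str
    version: str
    version_guess: Optional[str] = None
    extra_data: Optional[Dict[str, str]] = None


def describe_tarball(translated_url, sha256, version_guess=None, explicit_version=None):
    # A version annotated by the user takes precedence over a guessed one.
    return SourceInfoSketch(
        url=translated_url,
        medium="remote-file",    # freeform stand-in for SourceInfoMedium.REMOTE_FILE
        version_type="sha256",   # freeform stand-in for SourceVersionType.SHA256
        version=sha256,
        version_guess=explicit_version or version_guess,
        # Hypothetical extra key recording where the readable version came from.
        extra_data={"version-origin": "explicit" if explicit_version else "guessed"},
    )


info = describe_tarball(
    "https://flying-ponies.com/releases/9d0c936c78/pony-flight-release.tgz",
    sha256="0" * 64,            # placeholder, not a real checksum
    explicit_version="4.2.0",   # hypothetical user annotation
)
```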

SourceFetcher - Object for fetching individual URLs

Abstract Methods

SourceFetchers expose the following abstract methods. Unless explicitly mentioned, these methods are mandatory to implement.

  • SourceFetcher.fetch()

    Fetches the URL associated with this SourceFetcher, optionally taking an alias override.

  • SourceFetcher.get_source_info()

    Get a SourceInfo object to describe the provenance of this source.

    Optional: BuildStream will function correctly if this is unimplemented, but the ability to generate SBoMs will be impaired. It is highly recommended to implement this.

Class Reference

exception SourceError(message: str, *, detail: str | None = None, reason: str | None = None, temporary: bool = False)

Bases: BstError

This exception should be raised by Source implementations to report errors to the user.

Parameters:
  • message – The brief error description to report to the user

  • detail – A possibly multiline, more detailed error message

  • reason – An optional machine readable reason string, used for test cases

  • temporary – Whether the error is temporary, meaning the operation might succeed if it is run again.
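The sketch below shows how these parameters might be filled in when reporting a failed download. SourceErrorSketch is a hypothetical stand-in which only mirrors the documented signature so that the example runs without BuildStream; the message, detail and reason strings are made up.

```python
class SourceErrorSketch(Exception):
    """Hypothetical stand-in mirroring the documented SourceError signature."""

    def __init__(self, message, *, detail=None, reason=None, temporary=False):
        super().__init__(message)
        self.detail = detail
        self.reason = reason
        self.temporary = temporary


def fetch_sketch(url, network_up):
    if not network_up:
        raise SourceErrorSketch(
            "Failed to download {}".format(url),
            detail="The connection was refused.\nCheck your network and retry.",
            reason="download-failed",  # stable string that tests can match on
            temporary=True,            # a later retry might succeed
        )
    return "fetched"


try:
    fetch_sketch("https://example.com/pony.tgz", network_up=False)
except SourceErrorSketch as error:
    caught = error
```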

exception SourceImplError(message, reason=None)

Bases: BstError

This exception is expected to be raised from some unimplemented abstract methods.

There is no need to raise this exception, however some public abstract methods which are intended to be called by plugins may advertise the raising of this exception in the case of a source plugin which does not implement the said method, in which case it must be handled by the calling plugin.

class AliasSubstitution

Bases: object

An opaque data structure which may be passed through SourceFetcher.fetch() and in such cases must be provided to Source.translate_url().

class SourceInfoMedium(value)

Bases: FastEnum

Indicates the medium in which the source is obtained

Since: 2.5

WORKSPACE = <buildstream.source.SourceInfoMedium object>

Files in an open workspace

LOCAL = <buildstream.source.SourceInfoMedium object>

Files stored locally in the project

REMOTE_FILE = <buildstream.source.SourceInfoMedium object>

A remote file

GIT = <buildstream.source.SourceInfoMedium object>

A git repository

BAZAAR = <buildstream.source.SourceInfoMedium object>

The Bazaar revision control system

OCI_IMAGE = <buildstream.source.SourceInfoMedium object>

An OCI image, such as docker or podman images.

PYTHON_PACKAGE_INDEX = <buildstream.source.SourceInfoMedium object>

A python package obtained from a python package index like https://pypi.org

class SourceVersionType(value)

Bases: FastEnum

Indicates the type of the version string

Since: 2.5

COMMIT = <buildstream.source.SourceVersionType object>

A commit string which accurately represents a version in a source code repository or VCS

SHA256 = <buildstream.source.SourceVersionType object>

An sha256 checksum of the content of a file

CAS_DIGEST = <buildstream.source.SourceVersionType object>

A CAS digest expressed as {hash}/{size}.

The hash and size components represent the members of a Digest message as defined in the remote execution protocol
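As a small illustration of the {hash}/{size} form, the helper below formats a digest string for a blob of bytes. The function name is hypothetical, and the use of SHA-256 is an assumption of this sketch, since the digest function depends on how the CAS is configured.

```python
import hashlib


def cas_digest_string(data: bytes) -> str:
    # Assumption: SHA-256 is the digest function in use.
    return "{}/{}".format(hashlib.sha256(data).hexdigest(), len(data))


digest = cas_digest_string(b"pony")
hash_part, size_part = digest.split("/")
```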

OCI_DIGEST = <buildstream.source.SourceVersionType object>

An OCI image digest, as can be used to address images in a docker registry.

INDEXED_VERSION = <buildstream.source.SourceVersionType object>

This type of version is used for repositories which provide an interface to index content by version, where no additional validation is performed to ensure the uniqueness of the downloaded content (not recommended).

In the case of plugins which use this version type, it is probable that SourceInfo.version_guess == SourceInfo.version.

class SourceInfo

Bases: object

An object representing the provenance of input reported by Source.collect_source_info() and/or SourceFetcher.get_source_info()

See: documentation on generating SourceInfo.

Attention

A given SourceInfo for a given element is not guaranteed to be unique for a given cache key.

While it is true that plugins which generate SourceInfo must consider any configuration attributes in their cache keys, so as to produce differing cache keys when source provenance information can be reported differently, this does not account for the special nature of urls.

When considering the urls reported in SourceInfo, the urls are only guaranteed to be the primary urls as defined by the project’s source aliases, and arbitrary mirror urls will not be reported here.

Since these aliases are intentionally allowed to change without affecting cache keys, or can be redirected with junctions, it is possible to have a differing set of SourceInfo objects reported for a project which reports identical cache keys, in cases where primary alias mappings are changed.

Since: 2.5

kind: str

The Source plugin kind which reported this SourceInfo

url: str

The url of the source input

medium: SourceInfoMedium | str

The SourceInfoMedium of the source input, or in the case that an appropriate medium is not defined, a freeform string of the plugin’s choice describing the medium.

version_type: SourceVersionType | str

The SourceVersionType of the source input version, or in the case that an appropriate version type is not defined, a freeform string of the plugin’s choice depicting the type of version.

version: str

A string which represents a unique version of this source input

version_guess: str | None

A string representing the guessed human readable version of this source input

extra_data: Dict[str, str] | None

Additional plugin defined key/values

class SourceFetcher

Bases: object

This interface exists so that a source that downloads from multiple places (e.g. a git source with submodules) has a consistent interface for fetching and substituting aliases.

Attention

When implementing a SourceFetcher, remember to call Source.mark_download_url() for every URL found in the configuration data at Plugin.configure() time.

fetch(alias_override: AliasSubstitution | None = None, **kwargs) None

Fetch remote sources and mirror them locally, ensuring at least that the specific reference is cached locally.

Parameters:

alias_override – The alias to use instead of the default one defined by the aliases field in the project’s config. If provided, it must be used when calling Source.translate_url().

Raises:

SourceError

Implementors should raise SourceError if there is some network error or if the source reference could not be matched.

get_source_info() SourceInfo

Get the SourceInfo object describing this source

This method should only be called whenever Source.is_resolved() returns True.

SourceInfo objects created by implementors should be created with Source.create_source_info().

Returns: the SourceInfo object describing this source

Raises:

SourceImplError – if this method is unimplemented

Since: 2.5

mark_download_url(url: str) None

Identifies the URL that this SourceFetcher uses to download

This must be called during the fetcher’s initialization

Parameters:

url – The url used to download.

Note

While this must be called in a SourceFetcher initializer for the URL which will be used by the fetcher, note that any URLs which are known and specified in the Source configuration YAML must be marked with either Source.mark_download_url() or Source.translate_url() in the Plugin.configure() implementation.

class Source

Bases: Plugin

Base Source class.

All Sources derive from this class. This interface defines how the core will interact with Sources.

BST_REQUIRES_PREVIOUS_SOURCES_TRACK = False

Whether access to previous sources is required during track

When set to True:
  • all sources listed before this source in the given element will be fetched before this source is tracked

  • Source.track() will be called with an additional keyword argument previous_sources_dir where previous sources will be staged

  • this source can not be the first source for an element

BST_REQUIRES_PREVIOUS_SOURCES_FETCH = False

Whether access to previous sources is required during fetch

When set to True:
  • all sources listed before this source in the given element will be fetched before this source is fetched

  • Source.fetch() will be called with an additional keyword argument previous_sources_dir where previous sources will be staged

  • this source can not be the first source for an element

BST_REQUIRES_PREVIOUS_SOURCES_STAGE = False

Whether access to previous sources is required during cache

When set to True:
  • All sources listed before current source in the given element will be staged with the source when it’s cached.

  • This source can not be the first source for an element.

BST_STAGE_VIRTUAL_DIRECTORY = False

Whether we can stage this source directly to a virtual directory

When set to True, Source.stage_directory() and Source.init_workspace_directory() will be called in place of Source.stage() and Source.init_workspace() respectively.

COMMON_CONFIG_KEYS = ['kind', 'directory']

Common source config keys

Source config keys that must not be accessed in configure(), and should be checked for using node.validate_keys().

load_ref(node: MappingNode) None

Loads the SourceRef for this Source from the specified node.

Parameters:

node – The YAML node to load the ref from

Working with the source ref is discussed here.

Note

The SourceRef for the Source is expected to be read at Plugin.configure() time. This method will only be used for loading refs from locations other than the element.bst file where the given Source object has been declared.

get_ref() None | int | str | List[Any] | Dict[str, Any]

Fetch the SourceRef

Returns:

The internal SourceRef, or None

Working with the source ref is discussed here.

set_ref(ref: None | int | str | List[Any] | Dict[str, Any], node: MappingNode) None

Applies the internal ref, however it is represented

Parameters:
  • ref – The SourceRef to apply, or None

  • node – The MappingNode to update with the new ref

The implementor must update the node parameter to reflect the new ref, and it should store the passed ref so that it will be returned in any later calls to Source.get_ref().

The passed ref parameter is guaranteed to either be a value which has been previously retrieved by the Source.get_ref() method on the same plugin, or None.

Example:

# Implementation of Source.set_ref()
#
def set_ref(self, ref, node):

    # Update internal state of the ref
    self.ref = ref

    # Update the passed node so that we will read the new ref
    # next time this source plugin is configured with this node.
    #
    node["ref"] = self.ref

Working with the source ref is discussed here.

track(*, previous_sources_dir: str | None = None) None | int | str | List[Any] | Dict[str, Any]

Resolve a new ref from the plugin’s track option

Parameters:

previous_sources_dir (str) – directory where previous sources are staged. Note that this keyword argument is available only when BST_REQUIRES_PREVIOUS_SOURCES_TRACK is set to True.

Returns:

A new SourceRef, or None

If the backend in question supports resolving references from a symbolic tracking branch or tag, then this should be implemented to perform this task on behalf of bst source track commands.

This usually requires fetching new content from a remote origin to see if a new ref has appeared for your branch or tag. If the backend store allows one to query for a new ref from symbolic tracking data without downloading, then that is desirable.

Working with the source ref is discussed here.

fetch(*, previous_sources_dir: str | None = None) None

Fetch remote sources and mirror them locally, ensuring at least that the specific reference is cached locally.

Parameters:

previous_sources_dir (str) – directory where previous sources are staged. Note that this keyword argument is available only when BST_REQUIRES_PREVIOUS_SOURCES_FETCH is set to True.

Raises:

SourceError

Implementors should raise SourceError if there is some network error or if the source reference could not be matched.

stage(directory: str) None

Stage the sources to a directory

Parameters:

directory – Path to stage the source

Raises:

SourceError

Implementors should assume that directory already exists and stage already cached sources to the passed directory.

Implementors should raise SourceError when encountering some system error.

stage_directory(directory: Directory) None

Stage the sources to a directory

Parameters:

directory – Directory object to stage the source into

Raises:

SourceError

Implementors should assume that directory represents an existing directory root into which the source content can be populated.

Implementors should raise SourceError when encountering some system error.

Note

This will be called instead of Source.stage() in the case that BST_STAGE_VIRTUAL_DIRECTORY is set for this plugin.

init_workspace(directory: str) None

Stage sources for use as a workspace.

Parameters:

directory – Path of the workspace to initialize.

Raises:

SourceError

Default implementation is to call Source.stage().

Implementors overriding this method should assume that directory already exists.

Implementors should raise SourceError when encountering some system error.

init_workspace_directory(directory: Directory) None

Stage sources for use as a workspace.

Parameters:

directory – Directory object of the workspace to initialize.

Raises:

SourceError

Default implementation is to call Source.stage_directory().

Implementors overriding this method should assume that directory already exists.

Implementors should raise SourceError when encountering some system error.

Note

This will be called instead of Source.init_workspace() in the case that BST_STAGE_VIRTUAL_DIRECTORY is set for this plugin.

get_source_fetchers() Iterable[SourceFetcher]

Get the objects that are used for fetching

If this source doesn’t download from multiple URLs, returning None and falling back on the default behaviour is recommended.

Returns:

The Source’s SourceFetchers, if any.

Note

Implementors can implement this as a generator.

The SourceFetcher.fetch() method will be called on the returned fetchers one by one, before consuming the next fetcher in the list.
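The ordering guarantee in this note is what makes a generator useful: code placed after a yield only runs once the previously yielded fetcher has been fetched. The sketch below demonstrates this with hypothetical stand-ins (FetcherSketch, get_source_fetchers_sketch and the URLs are made up), and a plain loop imitates the documented calling convention of the core.

```python
class FetcherSketch:
    """Hypothetical stand-in for a SourceFetcher; records calls in a log."""

    def __init__(self, url, log):
        self.url = url
        self.log = log

    def fetch(self, alias_override=None, **kwargs):
        self.log.append("fetch " + self.url)


def get_source_fetchers_sketch(log):
    log.append("yield main")
    yield FetcherSketch("main.git", log)
    # By the time execution resumes here, "main.git" has already been
    # fetched, so a real plugin could now inspect it to discover submodules.
    log.append("yield submodule")
    yield FetcherSketch("submodule.git", log)


log = []
# Fetch each fetcher before asking the generator for the next one.
for fetcher in get_source_fetchers_sketch(log):
    fetcher.fetch()
```

After the loop, the log interleaves yielding and fetching, which shows that the second fetcher was only created after the first fetch completed.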

validate_cache() None

Implement any validations once we know the sources are cached

This is guaranteed to be called only once for a given session once the sources are known to be cached, before Source.stage() or Source.init_workspace() is called.

is_cached() bool

Get whether the source has a local copy of its data.

This method is guaranteed to only be called whenever Source.is_resolved() returns True.

Returns: whether the source is cached locally or not.

collect_source_info() Iterable[SourceInfo]

Get the SourceInfo objects describing this source

This method should only be called whenever Source.is_resolved() returns True.

SourceInfo objects created by implementors should be created with Source.create_source_info().

Returns: the SourceInfo objects describing this source

Raises:

SourceImplError – if the source class does not implement this method and does not implement SourceFetcher.get_source_info()

Note

If your plugin uses SourceFetcher objects, you can implement SourceFetcher.get_source_info() instead.

Since: 2.5

get_mirror_directory() str

Fetches the directory where this source should store things

Returns:

The directory belonging to this source

translate_url(url: str, *, alias_override: AliasSubstitution | None = None, primary: bool = True, suffix: str | None = None, extra_data: Dict[str, Any] | None = None) str

Translates the given url which may be specified with an alias into a fully qualified url.

Parameters:
  • url – A URL, which may be using an alias

  • alias_override – Optionally, a URI to override the alias with.

  • primary – Whether this is the primary URL for the source.

  • suffix – an optional suffix to append to the URL (Since: 2.2)

  • extra_data – Additional data provided by SourceMirror (Since: 2.2)

Returns:

The fully qualified URL, with aliases resolved

Note

This must be called for every URL in the configuration during Plugin.configure() if Source.mark_download_url() is not called.

The suffix argument may be used to translate URLs for which only the base portion of the URL was previously marked with Source.mark_download_url() at Plugin.configure() time.

mark_download_url(url: str, *, primary: bool = True) None

Identifies the URL that this Source uses to download

Parameters:
  • url (str) – The URL used to download

  • primary (bool) – Whether this is the primary URL for the source

Note

This must be called for every URL in the configuration during Plugin.configure() if Source.translate_url() is not called.

get_project_directory() str

Fetch the project base directory

This is useful for sources which need to load resources stored somewhere inside the project.

Returns:

The project base directory

tempdir() Iterator[str]

Context manager for working in a temporary directory

Yields:

A path to a temporary directory

This should be used by source plugins directly instead of the tempfile module. This one will automatically clean up in case of termination by catching the signal before os._exit(). It will also use the ‘mirror directory’ as expected for a source.

is_resolved() bool

Get whether the source is resolved.

This has a default implementation that checks whether the source has a ref or not. If it has a ref, it is assumed to be resolved.

Sources that never have a ref or have uncommon requirements can override this method to specify when they should be considered resolved.

Returns: whether the source is fully resolved or not

create_source_info(url: str, medium: SourceInfoMedium | str, version_type: SourceVersionType | str, version: str, *, version_guess: str | None = None, extra_data: Dict[str, str] | None = None) SourceInfo

Create a SourceInfo object

This function should be used to generate SourceInfo objects in Source.collect_source_info() and SourceFetcher.get_source_info() implementations.

Parameters:
  • url – The translated URL

  • medium – The SourceInfoMedium of the source input, or in the case that an appropriate medium is not defined, a freeform string of the plugin’s choice describing the medium.

  • version_type – The SourceVersionType of the source input version, or in the case that an appropriate version type is not defined, a freeform string of the plugin’s choice depicting the type of version.

  • version – A string which represents a unique version of this source input

  • version_guess – An optional string representing the guessed human readable version

  • extra_data – Additional plugin defined key/values

Since: 2.5