Source - Base source class
Built-in functionality
The Source base class provides built in keys which can be set when instantiating any Source.
Directory
The
directory
variable can be set for all sources of a type in project.conf or per source within an element. This sets the location within the build root that the content of the source will be loaded into. If the location does not exist, it will be created.
Abstract Methods
For loading and configuration purposes, Sources must implement the Plugin base class abstract methods.
Attention
In order to ensure that all configuration data is processed at
load time, it is important that all URLs have been processed during
Plugin.configure()
.
Source implementations must either call
Source.translate_url()
or
Source.mark_download_url()
for every URL that has been specified in the configuration during
Plugin.configure().
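As a rough illustration of the rule above, the following sketch marks every configured URL during configure(). It does not import BuildStream: FakeSource is a stand-in written only so the example runs on its own, and the TarballSource plugin and its "url" and "extra-urls" keys are hypothetical. A plain dict stands in for the configuration node.

```python
class FakeSource:
    """Stand-in for the Source base class, only so this sketch runs alone."""

    def __init__(self):
        self.marked_urls = []

    def mark_download_url(self, url):
        # The real method registers the URL with BuildStream at load time;
        # this stand-in just records it.
        self.marked_urls.append(url)


class TarballSource(FakeSource):
    """Hypothetical plugin; "url" and "extra-urls" are illustrative keys."""

    def configure(self, node):
        self.url = node["url"]
        self.extra_urls = list(node.get("extra-urls", []))

        # Process every configured URL during configure(), rather than
        # lazily at fetch time.
        for url in [self.url, *self.extra_urls]:
            self.mark_download_url(url)


source = TarballSource()
source.configure({"url": "upstream:pony.tgz", "extra-urls": ["upstream:pony.sig"]})
print(source.marked_urls)
```

A plugin that needs the resolved URL right away could call Source.translate_url() in the same place instead of Source.mark_download_url().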
Sources expose the following abstract methods. Unless explicitly mentioned, these methods are mandatory to implement.
Source.load_ref()
Load the ref from a specific YAML node
Source.get_ref()
Fetch the source ref
Source.set_ref()
Set a new ref explicitly
Source.track()
Automatically derive a new ref from a symbolic tracking branch
Source.fetch()
Fetch the actual payload for the currently set ref
Source.stage()
/Source.stage_directory()
Stage the sources for a given ref at a specified location
Source.init_workspace()
/Source.init_workspace_directory()
Stage sources for use as a workspace.
Optional: If left unimplemented, these will default to calling
Source.stage()
/Source.stage_directory()
Source.get_source_fetchers()
Get the objects that are used for fetching.
Optional: This only needs to be implemented for sources that need to download from multiple URLs while fetching (e.g. a git repo and its submodules). For details on how to define a SourceFetcher, see SourceFetcher.
Source.validate_cache()
Perform any validations which require the sources to be cached.
Optional: This is completely optional and will do nothing if left unimplemented.
Source.collect_source_info()
Collect SourceInfo objects to describe the provenance of sources.
Optional: BuildStream will function correctly if this is unimplemented, but the ability to generate SBoMs will be impaired. It is highly recommended to implement this.
Working with the source ref
The SourceRef
is used to determine the exact
version of data to be addressed by the source.
The various responsibilities involving the source reference are described here.
Loading and saving
The source reference is expected to be loaded at
Plugin.configure()
and
Source.load_ref()
time
from the provided MappingNode
.
The SourceRef
should be loaded from a single key
in that node. The recommended name for that key is ref, but it is ultimately up
to the implementor to decide.
When Source.set_ref()
is called,
the source reference should be assigned to the same single key in the
provided MappingNode
. This will be used to serialize changed
source references to YAML as a result of tracking.
Tracking new references
When the user tracks for new versions of the source,
then the new SourceRef
should be returned from
the Source.track()
implementation.
Managing internal state
Internally the source implementation is expected to keep track of its
SourceRef
. The internal state should be
updated when Plugin.configure()
,
Source.load_ref()
or
Source.set_ref()
is called.
The internal state should not be updated when
Source.track()
is called.
The internal source ref must be returned on demand whenever
Source.get_ref()
is called.
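The responsibilities described above can be sketched as follows. This is a minimal illustration, not a real plugin: a plain dict stands in for the MappingNode, the RefTrackingSource class and its "track" key are hypothetical, and track() returns a made-up value instead of contacting a remote.

```python
class RefTrackingSource:
    """Hypothetical plugin skeleton; plain dicts stand in for MappingNode."""

    def configure(self, node):
        self.tracking = node.get("track")
        # The ref is read at configure() time, from a single key.
        self.load_ref(node)

    def load_ref(self, node):
        self.ref = node.get("ref")

    def get_ref(self):
        # Return the internal ref on demand.
        return self.ref

    def set_ref(self, ref, node):
        # Update the internal state and the same single key in the node.
        self.ref = ref
        node["ref"] = ref

    def track(self):
        # Return the new ref WITHOUT updating the internal state.
        # A real implementation would resolve self.tracking remotely.
        return "resolved-" + str(self.tracking)


node = {"track": "main", "ref": "abc123"}
source = RefTrackingSource()
source.configure(node)

new_ref = source.track()
ref_after_track = source.get_ref()  # unchanged by track()

source.set_ref(new_ref, node)
print(ref_after_track, source.get_ref(), node["ref"])
```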
Generating the unique key
When Plugin.get_unique_key()
is called, the source’s SourceRef
must be considered
as a part of that key.
The unique key will be used to generate the cache keys
of elements using this source, and so the unique key should comprise every
configuration which may affect how the source is staged
,
as well as any configuration which uniquely identifies the source, which of course
includes the SourceRef
.
When plugins generate SourceInfo, it is also important that any configuration attributes which contribute to the generation of SourceInfo also be included in the unique key.
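A minimal sketch of this idea, assuming the unique key may be expressed as a list of plain values. The KeyedSource class and its strip_components option are hypothetical and exist only to show that the ref and any staging-related configuration all feed into the key.

```python
class KeyedSource:
    """Hypothetical plugin state, reduced to what affects the unique key."""

    def __init__(self, url, ref, strip_components=0):
        self.url = url
        self.ref = ref
        self.strip_components = strip_components

    def get_unique_key(self):
        # Everything that identifies the source or changes how it is
        # staged belongs here, including the ref.
        return [self.url, self.ref, self.strip_components]


a = KeyedSource("upstream:pony.tgz", "abc")
b = KeyedSource("upstream:pony.tgz", "def")
c = KeyedSource("upstream:pony.tgz", "abc", strip_components=1)
same_as_a = KeyedSource("upstream:pony.tgz", "abc")
```

Here a and b differ only in their ref and c differs only in a staging option, so all three produce different keys, while same_as_a produces the same key as a.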
Accessing previous sources
In the general case, all sources are fetched and tracked independently of one another. In situations where a source needs to access previous source(s) in order to perform its own track and/or fetch, the following attributes can be set to request access to previous sources:
BST_REQUIRES_PREVIOUS_SOURCES_TRACK
Indicate that access to previous sources is required during track
BST_REQUIRES_PREVIOUS_SOURCES_FETCH
Indicate that access to previous sources is required during fetch
The intended use of such plugins is to fetch external dependencies of other sources, typically using some kind of package manager, such that all the dependencies of the original source(s) are available at build time.
When implementing such a plugin, implementors should adhere to the following guidelines:
Implementations must be able to store the obtained artifacts in a subdirectory.
Implementations must be able to deterministically generate a unique ref, such that two refs are different if and only if they produce different outputs.
Implementations must not introduce host contamination.
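One possible way to approach the deterministic ref guideline is to hash the obtained content in a fixed order. The sketch below is an assumption about how a plugin might do this, not something BuildStream provides: directory_ref() is a hypothetical helper, and it only considers regular file paths and contents (empty directories, permissions and symlink targets are ignored).

```python
import hashlib
import os
import tempfile


def directory_ref(root):
    """Hash file paths and contents under root in a fixed order."""
    digest = hashlib.sha256()
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # in-place sort fixes the traversal order
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            rel = os.path.relpath(path, root).replace(os.sep, "/").encode("utf-8")
            with open(path, "rb") as f:
                data = f.read()
            # Length prefixes keep (name, content) boundaries unambiguous.
            digest.update(b"%d:%s" % (len(rel), rel))
            digest.update(b"%d:" % len(data))
            digest.update(data)
    return digest.hexdigest()


def demo():
    with tempfile.TemporaryDirectory() as a, tempfile.TemporaryDirectory() as b:
        # Same files, created in a different order.
        for root, names in ((a, ["x.txt", "y.txt"]), (b, ["y.txt", "x.txt"])):
            for name in names:
                with open(os.path.join(root, name), "w") as f:
                    f.write("content of " + name)
        same = directory_ref(a) == directory_ref(b)

        # Changing one file changes the ref.
        with open(os.path.join(b, "x.txt"), "w") as f:
            f.write("changed")
        differs = directory_ref(a) != directory_ref(b)
    return same, differs
```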
Generating SourceInfo for provenance information
Source plugins should implement either of the
Source.collect_source_info()
or
SourceFetcher.get_source_info()
methods in order to properly report provenance information and contribute to reports
such as SBoMs.
To implement these methods, you must use
Source.create_source_info()
to
instantiate the SourceInfo
object to return from these methods.
Attention
It is not recommended to consider the parameters used for implementing
tracking with Source.track()
.
Instead, any versioning information reported should be congruent with the URL and the current source reference.
Furthermore, if any of the configuration attributes implemented by the plugin
contribute to the generation of the SourceInfo objects, these configuration
values must be considered in the plugin’s
Plugin.get_unique_key()
implementation.
What follows here, are some guidelines and conventions for doing this properly.
The URL
The URL argument represents the location from which the source is obtained, and
should normally be the translated URL, as returned by
Source.translate_url()
.
In the case of SourceInfoMedium.LOCAL
, the URL can instead be a project
relative path to the local data.
The medium and version_type arguments
These refer to the medium by which the source data was obtained, and the meaning/type of the following “version” argument, respectively.
When possible, you should use the SourceInfoMedium
and
SourceVersionType
values which correspond to the medium
and version type which your Source plugin is using.
In cases where there is no suitable value available for your plugin, you can instead provide freeform strings which describe the medium and the version type.
Documentation
Your plugin’s module level docstring, which is used for documenting your plugin, should have a section describing the meaning of these values.
This is especially useful to promote interoperability with other tooling,
which might want to perform some automations based on the SourceInfo
object(s)
which your plugin reports.
Version
This is a string which uniquely identifies the version of the source, and its meaning is described by the “version_type” you specified.
Version guess
This is a human readable simplified version, more suitable for a cursory reading of a report like an SBoM.
Since in most cases it is not possible to automatically determine the version string
intended by upstream maintainers based on the knowledge you have, we refer to this
as a guessed version. For example, just because you have a tarball named pony-1.2.3.tgz
somewhere, does not guarantee that this is really version 1.2.3
of the “pony” project.
Configurability
When implementing a technique for guessing the version based on the information you have at hand, it is recommended to provide some flexibility to users of your plugin, who may have better knowledge about the conventions used by the upstream project and how they choose to express their versioning information.
An example of this is the version-guess-pattern
configuration made available
in the DownloadableFileSource built-in functionality.
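As a sketch of what a configurable guess could look like, the following function extracts a version from a URL with a user-overridable regular expression. The guess_version() helper and its default pattern are hypothetical and are not claimed to match the behaviour of version-guess-pattern in DownloadableFileSource; the example.com URLs are made up.

```python
import re

# Hypothetical default: two or three dot-separated numeric components.
DEFAULT_PATTERN = r"(\d+)\.(\d+)(?:\.(\d+))?"


def guess_version(url, pattern=DEFAULT_PATTERN):
    """Return a guessed version string, or None if nothing matches."""
    match = re.search(pattern, url)
    if match is None:
        return None
    # Join the captured groups with dots, skipping optional ones that
    # did not participate in the match.
    groups = [g for g in match.groups() if g is not None]
    return ".".join(groups) if groups else match.group(0)


print(guess_version("https://example.com/pony-1.2.3.tgz"))
print(guess_version("https://example.com/pony_4_2.tgz", r"_(\d+)_(\d+)"))
```

The second call shows a user supplying a pattern for an upstream which separates version components with underscores.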
Explicit versioning
In some use cases, it is impossible to derive a guessed version from the information available to the plugin.
For instance, consider an upstream which indexes their releases on a web page and
then hosts their releases without namespacing their release archives. In such
a case you might have a URL that looks something like:
https://flying-ponies.com/releases/9d0c936c78/pony-flight-release.tgz
For this reason, the implementing plugin should provide a way for users to manually annotate the source version.
An example of this is the version
configuration made available in the
DownloadableFileSource built-in functionality.
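A minimal sketch of how an explicit annotation might take precedence over guessing. The resolve_version_guess() helper, its parameters and its pattern are hypothetical; only the URL is taken from the example above.

```python
import re


def resolve_version_guess(url, explicit_version=None,
                          pattern=r"(\d+\.\d+(?:\.\d+)?)"):
    """Prefer a user-provided version, falling back to a pattern match."""
    if explicit_version is not None:
        return explicit_version
    match = re.search(pattern, url)
    return match.group(1) if match else None


url = "https://flying-ponies.com/releases/9d0c936c78/pony-flight-release.tgz"
guessed = resolve_version_guess(url)
annotated = resolve_version_guess(url, explicit_version="2.0")
```

With this URL nothing can be guessed, so only the manually annotated call yields a version.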
Extra data
In the case that the existing fields are insufficient to accurately describe the
provenance of this source, extra key/values can be specified when calling
Source.create_source_info()
.
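To show how the arguments described in this section fit together, here is a self-contained sketch. It does not use BuildStream: Medium, VersionType, Info and the module-level create_source_info() are stand-ins that only mirror the documented argument names, and the "checksum-tool" extra data key is hypothetical.

```python
import hashlib
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Optional, Union


class Medium(Enum):  # stand-in for SourceInfoMedium
    REMOTE_FILE = "remote-file"


class VersionType(Enum):  # stand-in for SourceVersionType
    SHA256 = "sha256"


@dataclass
class Info:  # stand-in for SourceInfo
    url: str
    medium: Union[Medium, str]
    version_type: Union[VersionType, str]
    version: str
    version_guess: Optional[str] = None
    extra_data: Optional[Dict[str, str]] = None


def create_source_info(url, medium, version_type, version, *,
                       version_guess=None, extra_data=None):
    # Stand-in with the same argument names as Source.create_source_info().
    return Info(url, medium, version_type, version, version_guess, extra_data)


def collect_source_info(translated_url, sha256sum, version_guess):
    # A single-URL plugin reports one object; yielding keeps it iterable.
    yield create_source_info(
        translated_url,
        Medium.REMOTE_FILE,
        VersionType.SHA256,
        sha256sum,
        version_guess=version_guess,
        extra_data={"checksum-tool": "hashlib"},  # hypothetical extra key
    )


checksum = hashlib.sha256(b"pony").hexdigest()
infos = list(collect_source_info("https://example.com/pony-1.2.3.tgz", checksum, "1.2.3"))
```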
SourceFetcher - Object for fetching individual URLs
Abstract Methods
SourceFetchers expose the following abstract methods. Unless explicitly mentioned, these methods are mandatory to implement.
SourceFetcher.fetch()
Fetches the URL associated with this SourceFetcher, optionally taking an alias override.
SourceFetcher.get_source_info()
Get a SourceInfo object to describe the provenance of this source.
Optional: BuildStream will function correctly if this is unimplemented, but the ability to generate SBoMs will be impaired. It is highly recommended to implement this.
Class Reference
- exception SourceError(message: str, *, detail: str | None = None, reason: str | None = None, temporary: bool = False)
Bases:
BstError
This exception should be raised by
Source
implementations to report errors to the user.
- Parameters:
message – The brief error description to report to the user
detail – A possibly multiline, more detailed error message
reason – An optional machine readable reason string, used for test cases
temporary – An indicator of whether the error is temporary, in which case the operation might succeed if it is run again.
- exception SourceImplError(message, reason=None)
Bases:
BstError
This exception is expected to be raised from some unimplemented abstract methods.
There is no need for plugins to raise this exception. However, some public abstract methods which are intended to be called by plugins may advertise the raising of this exception in the case of a source plugin which does not implement the said method, in which case it must be handled by the calling plugin.
- class AliasSubstitution
Bases:
object
An opaque data structure which may be passed through
SourceFetcher.fetch()
and in such cases must be provided to Source.translate_url()
.
- class SourceInfoMedium(value)
Bases:
FastEnum
Indicates the medium in which the source is obtained
Since: 2.5
- WORKSPACE = <buildstream.source.SourceInfoMedium object>
Files in an open workspace
- LOCAL = <buildstream.source.SourceInfoMedium object>
Files stored locally in the project
- REMOTE_FILE = <buildstream.source.SourceInfoMedium object>
A remote file
- GIT = <buildstream.source.SourceInfoMedium object>
A git repository
- BAZAAR = <buildstream.source.SourceInfoMedium object>
The Bazaar revision control system
- OCI_IMAGE = <buildstream.source.SourceInfoMedium object>
An OCI image, such as docker or podman images.
- PYTHON_PACKAGE_INDEX = <buildstream.source.SourceInfoMedium object>
A python package obtained from a python package index like https://pypi.org
- class SourceVersionType(value)
Bases:
FastEnum
Indicates the type of the version string
Since: 2.5
- COMMIT = <buildstream.source.SourceVersionType object>
A commit string which accurately represents a version in a source code repository or VCS
- SHA256 = <buildstream.source.SourceVersionType object>
An sha256 checksum of the content of a file
- CAS_DIGEST = <buildstream.source.SourceVersionType object>
A CAS digest expressed as {hash}/{size}. The hash and size components represent the members of a Digest message as defined in the remote execution protocol
- OCI_DIGEST = <buildstream.source.SourceVersionType object>
An OCI image digest, as can be used to address images in a docker registry.
- INDEXED_VERSION = <buildstream.source.SourceVersionType object>
This type of version is used in cases where we have repositories which have an interface to index content by version, and where no additional validation is performed to ensure the uniqueness of the downloaded content (not recommended).
In the case of plugins which use this version type, it is probable that
SourceInfo.version_guess == SourceInfo.version
.
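As a small illustration of the SHA256 and CAS_DIGEST version strings described above, the sketch below computes both from a byte string. The helper names are hypothetical, and the CAS example assumes that SHA-256 is the digest function in use, which is a property of the CAS configuration rather than something stated here.

```python
import hashlib


def sha256_version(data: bytes) -> str:
    """Hex-encoded sha256 checksum of some file content."""
    return hashlib.sha256(data).hexdigest()


def cas_digest_version(data: bytes) -> str:
    """A "{hash}/{size}" string, assuming a SHA-256 digest function."""
    return "{}/{}".format(hashlib.sha256(data).hexdigest(), len(data))


print(sha256_version(b"abc"))
print(cas_digest_version(b""))
```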
- class SourceInfo
Bases:
object
An object representing the provenance of input reported by
Source.collect_source_info()
and/or SourceFetcher.get_source_info().
See: documentation on generating SourceInfo.
Attention
A given SourceInfo for a given element is not guaranteed to be unique for a given cache key.
While it is true that plugins which generate SourceInfo must consider any configuration attributes in their cache keys, so as to produce differing cache keys when source provenance information can be reported differently, this does not account for the special nature of urls.
When considering the urls reported in SourceInfo, the urls are only guaranteed to be the primary urls as defined by the project’s source aliases, and arbitrary mirror urls will not be reported here.
Since these aliases are intentionally allowed to change without affecting cache keys, or can be redirected with junctions, it is possible to have a differing set of SourceInfo objects reported for a project which reports identical cache keys, in cases where primary alias mappings are changed.
Since: 2.5
- kind: str
The Source plugin kind which reported this SourceInfo
- url: str
The url of the source input
- medium: SourceInfoMedium | str
The
SourceInfoMedium
of the source input, or in the case that an appropriate medium is not defined, a freeform string of the plugin’s choice describing the medium.
- version_type: SourceVersionType | str
The
SourceVersionType
of the source input version, or in the case that an appropriate version type is not defined, a freeform string of the plugin’s choice depicting the type of version.
- version: str
A string which represents a unique version of this source input
- version_guess: str | None
A string representing the guessed human readable version of this source input
- extra_data: Dict[str, str] | None
Additional plugin defined key/values
- class SourceFetcher
Bases:
object
This interface exists so that a source that downloads from multiple places (e.g. a git source with submodules) has a consistent interface for fetching and substituting aliases.
Attention
When implementing a SourceFetcher, remember to call
Source.mark_download_url()
for every URL found in the configuration data at Plugin.configure()
time.
- fetch(alias_override: AliasSubstitution | None = None, **kwargs) None
Fetch remote sources and mirror them locally, ensuring at least that the specific reference is cached locally.
- Parameters:
alias_override – The alias to use instead of the default one defined by the aliases field in the project’s config. If provided, it must be used when calling
Source.translate_url()
.
- Raises:
SourceError –
Implementors should raise
SourceError
if there is some network error or if the source reference could not be matched.
- get_source_info() SourceInfo
Get the
SourceInfo
object describing this source
This method should only be called whenever
Source.is_resolved()
returns True.
SourceInfo objects created by implementors should be created with
Source.create_source_info()
.
Returns: the
SourceInfo
object describing this source
- Raises:
SourceImplError – if this method is unimplemented
Since: 2.5
- mark_download_url(url: str) None
Identifies the URL that this SourceFetcher uses to download
This must be called during the fetcher’s initialization
- Parameters:
url – The url used to download.
Note
While this must be called in a SourceFetcher initializer for the URL which will be used by the fetcher, note that any URLs which are known and specified in the Source configuration YAML must be marked with either
Source.mark_download_url()
or Source.translate_url()
in the Plugin.configure()
implementation.
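The SourceFetcher contract described above can be sketched like this. FakeSource and FakeFetcher are stand-ins written so the example runs without BuildStream, SubmoduleFetcher is a hypothetical fetcher, and the alias override is modelled as a plain URI string even though the real AliasSubstitution is opaque. No network access happens; fetch() only records the URL it would download from.

```python
class FakeSource:
    """Stand-in for Source with a single hard-coded alias."""

    aliases = {"upstream": "https://example.com/"}

    def translate_url(self, url, *, alias_override=None):
        alias, _, rest = url.partition(":")
        base = alias_override if alias_override is not None else self.aliases[alias]
        return base + rest


class FakeFetcher:
    """Stand-in for the SourceFetcher base class."""

    def __init__(self):
        self.download_url = None

    def mark_download_url(self, url):
        self.download_url = url


class SubmoduleFetcher(FakeFetcher):
    """Hypothetical fetcher for one extra URL of a source."""

    def __init__(self, source, url):
        super().__init__()
        self.source = source
        self.url = url
        self.fetched_from = None
        # Mark the URL during the fetcher's initialization.
        self.mark_download_url(url)

    def fetch(self, alias_override=None, **kwargs):
        # Always pass the override through when translating the URL.
        resolved = self.source.translate_url(self.url, alias_override=alias_override)
        self.fetched_from = resolved  # a real fetcher would download here


fetcher = SubmoduleFetcher(FakeSource(), "upstream:sub.git")
fetcher.fetch()
default_location = fetcher.fetched_from
fetcher.fetch(alias_override="https://mirror.example.org/")
mirror_location = fetcher.fetched_from
```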
- class Source
Bases:
Plugin
Base Source class.
All Sources derive from this class. This interface defines how the core will interact with Sources.
- BST_REQUIRES_PREVIOUS_SOURCES_TRACK = False
Whether access to previous sources is required during track
- When set to True:
all sources listed before this source in the given element will be fetched before this source is tracked
Source.track() will be called with an additional keyword argument previous_sources_dir where previous sources will be staged
this source can not be the first source for an element
- BST_REQUIRES_PREVIOUS_SOURCES_FETCH = False
Whether access to previous sources is required during fetch
- When set to True:
all sources listed before this source in the given element will be fetched before this source is fetched
Source.fetch() will be called with an additional keyword argument previous_sources_dir where previous sources will be staged
this source can not be the first source for an element
- BST_REQUIRES_PREVIOUS_SOURCES_STAGE = False
Whether access to previous sources is required during cache
- When set to True:
All sources listed before current source in the given element will be staged with the source when it’s cached.
This source can not be the first source for an element.
- BST_STAGE_VIRTUAL_DIRECTORY = False
Whether we can stage this source directly to a virtual directory
When set to True,
Source.stage_directory()
and Source.init_workspace_directory()
will be called in place of Source.stage()
and Source.init_workspace()
respectively.
- COMMON_CONFIG_KEYS = ['kind', 'directory']
Common source config keys
Source config keys that must not be accessed in configure(), and should be checked for using node.validate_keys().
- load_ref(node: MappingNode) None
Loads the
SourceRef
for this Source from the specified node.
- Parameters:
node – The YAML node to load the ref from
Working with the source ref is discussed here.
Note
The
SourceRef
for the Source is expected to be read at Plugin.configure()
time. This will only be used for loading refs from locations other than the element.bst file where the given Source object has been declared.
- get_ref() None | int | str | List[Any] | Dict[str, Any]
Fetch the
SourceRef
- Returns:
The internal
SourceRef
, or None
Working with the source ref is discussed here.
- set_ref(ref: None | int | str | List[Any] | Dict[str, Any], node: MappingNode) None
Applies the internal ref, however it is represented
- Parameters:
ref – The internal
SourceRef
to set, or None
node – The same node which was previously passed to
Plugin.configure()
and Source.load_ref()
The implementor must update the node parameter to reflect the new ref, and it should store the passed ref so that it will be returned in any later calls to
Source.get_ref()
.
The passed ref parameter is guaranteed to either be a value which has been previously retrieved by the
Source.get_ref()
method on the same plugin, or None.
Example:
    # Implementation of Source.set_ref()
    #
    def set_ref(self, ref, node):

        # Update internal state of the ref
        self.ref = ref

        # Update the passed node so that we will read the new ref
        # next time this source plugin is configured with this node.
        #
        node["ref"] = self.ref
Working with the source ref is discussed here.
- track(*, previous_sources_dir: str | None = None) None | int | str | List[Any] | Dict[str, Any]
Resolve a new ref from the plugin’s track option
- Parameters:
previous_sources_dir (str) – directory where previous sources are staged. Note that this keyword argument is available only when
BST_REQUIRES_PREVIOUS_SOURCES_TRACK
is set to True.
- Returns:
A new
SourceRef
, or None
If the backend in question supports resolving references from a symbolic tracking branch or tag, then this should be implemented to perform this task on behalf of bst source track commands.
This usually requires fetching new content from a remote origin to see if a new ref has appeared for your branch or tag. If the backend store allows one to query for a new ref from symbolic tracking data without downloading, then that is desirable.
Working with the source ref is discussed here.
- fetch(*, previous_sources_dir: str | None = None) None
Fetch remote sources and mirror them locally, ensuring at least that the specific reference is cached locally.
- Parameters:
previous_sources_dir (str) – directory where previous sources are staged. Note that this keyword argument is available only when
BST_REQUIRES_PREVIOUS_SOURCES_FETCH
is set to True.
- Raises:
SourceError –
Implementors should raise
SourceError
if there is some network error or if the source reference could not be matched.
- stage(directory: str) None
Stage the sources to a directory
- Parameters:
directory – Path to stage the source
- Raises:
SourceError –
Implementors should assume that directory already exists and stage already cached sources to the passed directory.
Implementors should raise
SourceError
when encountering some system error.
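A minimal sketch of the error handling described for Source.stage(), assuming the cached source is a single file. The SourceError class here is a stand-in with the documented constructor arguments so that the example runs without BuildStream; stage_cached_file() and the "stage-failed" reason string are hypothetical.

```python
import os
import shutil
import tempfile


class SourceError(Exception):
    """Stand-in for BuildStream's SourceError so the sketch runs alone."""

    def __init__(self, message, *, detail=None, reason=None, temporary=False):
        super().__init__(message)
        self.detail = detail
        self.reason = reason
        self.temporary = temporary


def stage_cached_file(cached_path, directory):
    """Copy an already cached file into an existing staging directory."""
    try:
        shutil.copy2(cached_path, directory)
    except OSError as e:
        # Report system errors to the user through SourceError.
        raise SourceError(
            "Failed to stage {}".format(os.path.basename(cached_path)),
            detail=str(e),
            reason="stage-failed",  # hypothetical reason string
        ) from e


def demo():
    with tempfile.TemporaryDirectory() as cache, tempfile.TemporaryDirectory() as target:
        cached = os.path.join(cache, "pony.txt")
        with open(cached, "w") as f:
            f.write("neigh")
        stage_cached_file(cached, target)
        staged = os.path.exists(os.path.join(target, "pony.txt"))
        try:
            stage_cached_file(os.path.join(cache, "missing.txt"), target)
            failed = False
        except SourceError:
            failed = True
    return staged, failed
```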
- stage_directory(directory: Directory) None
Stage the sources to a directory
- Parameters:
directory –
Directory
object to stage the source into
- Raises:
SourceError –
Implementors should assume that directory represents an existing directory root into which the source content can be populated.
Implementors should raise
SourceError
when encountering some system error.
Note
This will be called instead of
Source.stage()
in the case that BST_STAGE_VIRTUAL_DIRECTORY
is set for this plugin.
- init_workspace(directory: str) None
Stage sources for use as a workspace.
- Parameters:
directory – Path of the workspace to initialize.
- Raises:
SourceError –
Default implementation is to call
Source.stage()
.
Implementors overriding this method should assume that directory already exists.
Implementors should raise
SourceError
when encountering some system error.
- init_workspace_directory(directory: Directory) None
Stage sources for use as a workspace.
- Parameters:
directory –
Directory
object of the workspace to initialize.
- Raises:
SourceError –
Default implementation is to call
Source.stage_directory()
.
Implementors overriding this method should assume that directory already exists.
Implementors should raise
SourceError
when encountering some system error.
Note
This will be called instead of
Source.init_workspace()
in the case that BST_STAGE_VIRTUAL_DIRECTORY
is set for this plugin.
- get_source_fetchers() Iterable[SourceFetcher]
Get the objects that are used for fetching
If this source doesn’t download from multiple URLs, returning None and falling back on the default behaviour is recommended.
- Returns:
The Source’s SourceFetchers, if any.
Note
Implementors can implement this as a generator.
The
SourceFetcher.fetch()
method will be called on the returned fetchers one by one, before consuming the next fetcher in the list.
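The calling order described in the note above is what makes a generator useful: later fetchers can depend on what an earlier fetch discovered. The sketch below illustrates this with stand-in classes only. RecordingFetcher, GitLikeSource and fetch_all() are hypothetical, and fetch_all() merely mimics the documented behaviour of calling fetch() on each fetcher before the next one is requested.

```python
class RecordingFetcher:
    """Stand-in for a SourceFetcher; fetch() only records that it ran."""

    def __init__(self, source, name, discovers=()):
        self.source = source
        self.name = name
        self.discovers = list(discovers)

    def fetch(self, alias_override=None, **kwargs):
        self.source.log.append("fetch " + self.name)
        # Pretend that fetching this repository reveals its submodules.
        self.source.submodules.extend(self.discovers)


class GitLikeSource:
    """Hypothetical source with a main repository and submodules."""

    def __init__(self):
        self.log = []
        self.submodules = []

    def get_source_fetchers(self):
        yield RecordingFetcher(self, "main", discovers=["sub-a", "sub-b"])
        # The generator resumes only after the main fetch() has been
        # called, so the submodule list is populated by now.
        for name in self.submodules:
            yield RecordingFetcher(self, name)


def fetch_all(source):
    # Mimics the documented order: fetch each fetcher before asking
    # the generator for the next one.
    for fetcher in source.get_source_fetchers():
        fetcher.fetch()


source = GitLikeSource()
fetch_all(source)
print(source.log)
```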
- validate_cache() None
Implement any validations once we know the sources are cached
This is guaranteed to be called only once for a given session once the sources are known to be cached, before
Source.stage()
or Source.init_workspace()
is called.
- is_cached() bool
Get whether the source has a local copy of its data.
This method is guaranteed to only be called whenever
Source.is_resolved()
returns True.
Returns: whether the source is cached locally or not.
- collect_source_info() Iterable[SourceInfo]
Get the
SourceInfo
objects describing this source
This method should only be called whenever
Source.is_resolved()
returns True.
SourceInfo objects created by implementors should be created with
Source.create_source_info()
.
Returns: the
SourceInfo
objects describing this source
- Raises:
SourceImplError – if the source class does not implement this method and does not implement
SourceFetcher.get_source_info()
Note
If your plugin uses
SourceFetcher
objects, you can implement SourceFetcher.get_source_info()
instead.
Since: 2.5
- get_mirror_directory() str
Fetches the directory where this source should store things
- Returns:
The directory belonging to this source
- translate_url(url: str, *, alias_override: AliasSubstitution | None = None, primary: bool = True, suffix: str | None = None, extra_data: Dict[str, Any] | None = None) str
Translates the given url which may be specified with an alias into a fully qualified url.
- Parameters:
url – A URL, which may be using an alias
alias_override – Optionally, a URI to override the alias with.
primary – Whether this is the primary URL for the source.
suffix – an optional suffix to append to the URL (Since: 2.2)
extra_data – Additional data provided by
SourceMirror
(Since: 2.2)
- Returns:
The fully qualified URL, with aliases resolved
Note
This must be called for every URL in the configuration during
Plugin.configure()
if Source.mark_download_url()
is not called.
The suffix argument may be used to translate URLs for which only the base portion of the URL was previously marked with
Source.mark_download_url()
at Plugin.configure()
time.
- mark_download_url(url: str, *, primary: bool = True) None
Identifies the URL that this Source uses to download
- Parameters:
url (str) – The URL used to download
primary (bool) – Whether this is the primary URL for the source
Note
This must be called for every URL in the configuration during
Plugin.configure()
if Source.translate_url()
is not called.
- get_project_directory() str
Fetch the project base directory
This is useful for sources which need to load resources stored somewhere inside the project.
- Returns:
The project base directory
- tempdir() Iterator[str]
Context manager for working in a temporary directory
- Yields:
A path to a temporary directory
This should be used by source plugins directly instead of the tempfile module. This one will automatically clean up in case of termination by catching the signal before os._exit(). It will also use the ‘mirror directory’ as expected for a source.
- is_resolved() bool
Get whether the source is resolved.
This has a default implementation that checks whether the source has a ref or not. If it has a ref, it is assumed to be resolved.
Sources that never have a ref or have uncommon requirements can override this method to specify when they should be considered resolved
Returns: whether the source is fully resolved or not
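A small sketch of the difference between the default behaviour and an override. Both classes are stand-ins: DefaultResolution only mimics the documented default (resolved when a ref is present), and RefLessSource is a hypothetical source which never has a ref.

```python
class DefaultResolution:
    """Stand-in mimicking the documented default implementation."""

    def __init__(self, ref=None):
        self.ref = ref

    def get_ref(self):
        return self.ref

    def is_resolved(self):
        # Default: a source with a ref is assumed to be resolved.
        return self.get_ref() is not None


class RefLessSource(DefaultResolution):
    """Hypothetical source which never has a ref, e.g. local files."""

    def get_ref(self):
        return None

    def is_resolved(self):
        # Nothing needs resolving, so this is always considered resolved.
        return True


untracked = DefaultResolution()
tracked = DefaultResolution(ref="abc123")
local = RefLessSource()
```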
- create_source_info(url: str, medium: SourceInfoMedium | str, version_type: SourceVersionType | str, version: str, *, version_guess: str | None = None, extra_data: Dict[str, str] | None = None) SourceInfo
Create a
SourceInfo
object
This function should be used to generate SourceInfo objects in
Source.collect_source_info()
and SourceFetcher.get_source_info()
implementations.
- Parameters:
url – The translated URL
medium – The
SourceInfoMedium
of the source input, or in the case that an appropriate medium is not defined, a freeform string of the plugin’s choice describing the medium.
version_type – The
SourceVersionType
of the source input version, or in the case that an appropriate version type is not defined, a freeform string of the plugin’s choice depicting the type of version.
version – A string which represents a unique version of this source input
version_guess – An optional string representing the guessed human readable version
extra_data – Additional plugin defined key/values
Since: 2.5