platform
Modules
illumina
Methods for working with Illumina-specific UMIs in SAM files
The functions in this module make it easy to:
- check whether a UMI is valid
- extract UMI(s) from an Illumina-style read name
- copy a UMI from an alignment's read name to its RX SAM tag
Attributes
SAM_UMI_DELIMITER
module-attribute
Delimiter between multiple UMIs. The SAM specification recommends a hyphen; see https://samtools.github.io/hts-specs/SAMtags.pdf
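As a minimal illustration of the hyphen convention, the sketch below joins several UMIs into one RX-style value and splits it back. The constant is redefined locally with an assumed value of "-" rather than imported, and the helper names join_umis and split_umis are hypothetical; they are not part of this module.

```python
from typing import List

# Assumed value, mirroring the hyphen recommended by the SAM tags specification.
SAM_UMI_DELIMITER = "-"


def join_umis(umis: List[str]) -> str:
    """Join individual UMIs into a single RX-style value (hypothetical helper)."""
    return SAM_UMI_DELIMITER.join(umis)


def split_umis(rx_value: str) -> List[str]:
    """Split an RX-style value back into its individual UMIs (hypothetical helper)."""
    return rx_value.split(SAM_UMI_DELIMITER)


print(join_umis(["ACGT", "TTGA"]))
print(split_umis("ACGT-TTGA"))
```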
Functions
copy_umi_from_read_name
copy_umi_from_read_name(rec: AlignedSegment, strict: bool = False, remove_umi: bool = False) -> bool
Copy a UMI from an alignment's read name to its RX SAM tag. The UMI is not copied to the RX tag if it is invalid.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| rec | AlignedSegment | The alignment record to update. | required |
| strict | bool | If … | False |
| remove_umi | bool | If … | False |
Returns:
| Type | Description |
|---|---|
| bool | |
Raises:
| Type | Description |
|---|---|
| ValueError | If the read name does not end with a valid UMI. |
| ValueError | If the record already has a populated … |
Source code in fgpyo/platform/illumina.py
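The copy step described above can be sketched without pysam as follows. This is an illustration under several assumptions, not the module's implementation: FakeRecord is a hypothetical stand-in for AlignedSegment, the read name is assumed to be colon-delimited with multiple UMIs joined by "+", a valid UMI is assumed to consist only of the bases A, C, G, T, and N, the "already populated" error is assumed to refer to the RX tag, and the return value is assumed to report whether a UMI was copied. The strict and remove_umi options are left out.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class FakeRecord:
    """Hypothetical stand-in for an alignment record; not pysam's AlignedSegment."""

    query_name: str
    tags: Dict[str, str] = field(default_factory=dict)


def copy_umi_sketch(rec: FakeRecord) -> bool:
    """Copy a trailing UMI from the read name into the RX tag (illustrative only)."""
    # Assumption: an existing, non-empty RX tag is treated as an error.
    if rec.tags.get("RX"):
        raise ValueError(f"Record {rec.query_name} already has an RX tag")

    # Assumption: ":" separates read-name fields and "+" separates UMIs.
    umis = rec.query_name.split(":")[-1].split("+")

    # Assumption: a valid UMI is a non-empty string of A, C, G, T, or N.
    if not all(umi and set(umi) <= set("ACGTN") for umi in umis):
        return False  # invalid UMI: nothing is copied

    rec.tags["RX"] = "-".join(umis)  # hyphen is the SAM-recommended delimiter
    return True


rec = FakeRecord("A00001:1:FLOWCELL:1:1101:1000:2000:ACGT+TTGA")
print(copy_umi_sketch(rec), rec.tags)
```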
extract_umis_from_read_name
extract_umis_from_read_name(read_name: str, read_name_delimiter: str = _ILLUMINA_READ_NAME_DELIMITER, umi_delimiter: str = _ILLUMINA_UMI_DELIMITER, strict: bool = False) -> Optional[str]
Extract UMI(s) from an Illumina-style read name.
The UMI is expected to be the final component of the read name, delimited by read_name_delimiter. Multiple UMIs may be present, delimited by umi_delimiter; this delimiter is replaced by the SAM-standard hyphen (-).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| read_name | str | The read name to extract the UMI from. | required |
| read_name_delimiter | str | The delimiter separating the components of the read name. | _ILLUMINA_READ_NAME_DELIMITER |
| umi_delimiter | str | The delimiter separating multiple UMIs. | _ILLUMINA_UMI_DELIMITER |
| strict | bool | If … | False |
Returns:
| Type | Description |
|---|---|
| Optional[str] | The UMI extracted from the read name, or None if no UMI was found. Multiple UMIs are returned in a single string, separated by a hyphen (-). |
Raises:
| Type | Description |
|---|---|
| ValueError | If the read name does not end with a valid UMI. |
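The extraction behavior described in this section can be sketched in plain Python. The delimiter values below (":" for read-name fields and "+" between UMIs) are assumptions, since the module's private constants are not shown here; the validity rule (bases A, C, G, T, N only) and the choice to return None when the read name contains no delimiter are also assumptions. The strict parameter is omitted, and the function name extract_umis_sketch is hypothetical.

```python
from typing import Optional

# Assumed delimiter values; the module's private constants are not shown here.
READ_NAME_DELIMITER = ":"
UMI_DELIMITER = "+"
SAM_UMI_DELIMITER = "-"
VALID_BASES = frozenset("ACGTN")  # assumed definition of a valid UMI base


def is_valid_umi(umi: str) -> bool:
    """Return True for a non-empty string made only of assumed valid bases."""
    return len(umi) > 0 and all(base in VALID_BASES for base in umi)


def extract_umis_sketch(read_name: str) -> Optional[str]:
    """Extract trailing UMI(s) from a read name (illustrative only)."""
    # Assumption: a read name without any delimiter carries no UMI.
    if READ_NAME_DELIMITER not in read_name:
        return None

    # The UMI is expected to be the final delimited component.
    last_field = read_name.split(READ_NAME_DELIMITER)[-1]
    umis = last_field.split(UMI_DELIMITER)

    if not all(is_valid_umi(umi) for umi in umis):
        raise ValueError(f"Read name does not end with a valid UMI: {read_name}")

    # Rejoin multiple UMIs with the SAM-recommended hyphen.
    return SAM_UMI_DELIMITER.join(umis)


print(extract_umis_sketch("A00001:1:FLOWCELL:1:1101:1000:2000:ACGT+TTGA"))
```

Splitting on the UMI delimiter before validating keeps each UMI checked on its own, so a hyphen already present in the read name is not mistaken for a separator.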