flexs.utils.sequence_utils

Utility functions for manipulating sequences.

flexs.utils.sequence_utils.AAS = 'ILVAGMFYWEDQNHCRKSTP'

Amino acid alphabet for proteins (20 characters; no stop codon symbol).

Type

str

flexs.utils.sequence_utils.BA = '01'

Binary alphabet ‘01’.

Type

str

flexs.utils.sequence_utils.DNAA = 'TGCA'

DNA alphabet (4 nucleotide bases).

Type

str

flexs.utils.sequence_utils.RNAA = 'UGCA'

RNA alphabet (4 nucleotide bases).

Type

str
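Because each alphabet is a plain string, a character's position in the string serves as its index. The following is a minimal sketch that redefines the four constants locally with the values listed above (rather than importing them) and builds a lookup table; `dna_index` is a hypothetical helper name, not part of the module.

```python
# Values copied from the constants documented above.
AAS = "ILVAGMFYWEDQNHCRKSTP"
BA = "01"
DNAA = "TGCA"
RNAA = "UGCA"

# Hypothetical helper: map each DNA base to its position in the alphabet string.
dna_index = {base: i for i, base in enumerate(DNAA)}
print(dna_index)  # {'T': 0, 'G': 1, 'C': 2, 'A': 3}
```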

flexs.utils.sequence_utils.construct_mutant_from_sample(pwm_sample, one_hot_base)

Return a one-hot encoded mutant. This is a utility function used by some explorers.

Return type

ndarray

flexs.utils.sequence_utils.generate_random_mutant(sequence, mu, alphabet)

Generate a mutant of sequence where each residue mutates with probability mu.

The expected total number of mutations is therefore len(sequence) * mu.

Parameters
  • sequence (str) – Sequence to mutate.

  • mu (float) – Probability of mutation per residue.

  • alphabet (str) – Alphabet string.

Return type

str

Returns

Mutant sequence string.
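The per-residue rule described above can be sketched as follows. This is an illustration under stated assumptions, not the FLEXS implementation: `random_mutant_sketch` is a hypothetical name, and the sketch assumes a mutated residue is redrawn uniformly from `alphabet`, so it may happen to redraw the original character.

```python
import random


def random_mutant_sketch(sequence: str, mu: float, alphabet: str) -> str:
    """Hypothetical sketch: mutate each residue independently with probability mu."""
    mutant = []
    for residue in sequence:
        if random.random() < mu:
            # Assumption: the replacement is drawn uniformly from the alphabet,
            # so it can coincide with the original residue.
            mutant.append(random.choice(alphabet))
        else:
            mutant.append(residue)
    return "".join(mutant)


random.seed(0)  # fixed seed only to make the illustration repeatable
wild_type = "ATGCATGC"
mutant = random_mutant_sketch(wild_type, mu=0.25, alphabet="TGCA")
print(wild_type, "->", mutant)
```

With mu=0.25 and a sequence of length 8, about 2 positions are selected for mutation on average.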

flexs.utils.sequence_utils.generate_random_sequences(length, number, alphabet)

Generate number random sequences, each of the given length, with characters drawn from alphabet.

Return type

List[str]
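A minimal sketch of how such a generator might look, assuming each character is drawn independently and uniformly from `alphabet`. The name `random_sequences_sketch` is hypothetical, and the library's actual sampling scheme may differ.

```python
import random
from typing import List


def random_sequences_sketch(length: int, number: int, alphabet: str) -> List[str]:
    """Hypothetical sketch: return `number` random strings of `length` characters."""
    return [
        # Assumption: characters are sampled independently and uniformly.
        "".join(random.choice(alphabet) for _ in range(length))
        for _ in range(number)
    ]


seqs = random_sequences_sketch(length=5, number=3, alphabet="UGCA")
print(seqs)
```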

flexs.utils.sequence_utils.generate_single_mutants(wt, alphabet)

Generate all single mutants of the wild-type sequence wt over alphabet.

Return type

List[str]
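One way to enumerate single mutants is to substitute every alternative character at every position. The sketch below is an assumption-laden illustration: `single_mutants_sketch` is a hypothetical name, and it excludes the wild type itself and emits mutants in position order, which may not match the library's output (for example, the library may include wt or duplicates).

```python
from typing import List


def single_mutants_sketch(wt: str, alphabet: str) -> List[str]:
    """Hypothetical sketch: all sequences differing from wt at exactly one position."""
    mutants = []
    for i, original in enumerate(wt):
        for ch in alphabet:
            if ch != original:  # skip the substitution that reproduces wt
                mutants.append(wt[:i] + ch + wt[i + 1:])
    return mutants


mutants = single_mutants_sketch("010", "01")
print(mutants)  # ['110', '000', '011']
```

Under these assumptions, and provided every character of wt is in the alphabet, the result has len(wt) * (len(alphabet) - 1) entries.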

flexs.utils.sequence_utils.one_hot_to_string(one_hot, alphabet)

Return the sequence string representing a one-hot vector according to an alphabet.

Parameters
  • one_hot (Union[List[List[int]], ndarray]) – One-hot of shape (len(sequence), len(alphabet)) representing a sequence.

  • alphabet (str) – Alphabet string (assigns each character an index).

Return type

str

Returns

Sequence string representation of one_hot.
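Decoding amounts to finding the hot index in each row and looking up the character at that index in the alphabet. The following is a minimal pure-Python sketch of that idea; `one_hot_to_string_sketch` is a hypothetical name, and taking the position of the largest entry in each row is an assumption about how ties or non-binary rows would be handled.

```python
from typing import Sequence


def one_hot_to_string_sketch(one_hot: Sequence[Sequence[int]], alphabet: str) -> str:
    """Hypothetical sketch: decode each one-hot row to its alphabet character."""
    residues = []
    for row in one_hot:
        # Index of the largest entry in the row (the "hot" position).
        hot_index = max(range(len(row)), key=lambda j: row[j])
        residues.append(alphabet[hot_index])
    return "".join(residues)


one_hot = [
    [0, 0, 0, 1],  # index 3 -> 'A' in 'TGCA'
    [1, 0, 0, 0],  # index 0 -> 'T'
    [0, 1, 0, 0],  # index 1 -> 'G'
]
decoded = one_hot_to_string_sketch(one_hot, "TGCA")
print(decoded)  # ATG
```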

flexs.utils.sequence_utils.string_to_one_hot(sequence, alphabet)

Return the one-hot representation of a sequence string according to an alphabet.

Parameters
  • sequence (str) – Sequence string to convert to one-hot representation.

  • alphabet (str) – Alphabet string (assigns each character an index).

Return type

ndarray

Returns

One-hot numpy array of shape (len(sequence), len(alphabet)).
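The encoding can be sketched with NumPy as follows. This is an illustration, not the library's code: `string_to_one_hot_sketch` is a hypothetical name, the integer dtype is an assumption, and the sketch raises ValueError for characters missing from the alphabet, which may differ from the library's behavior.

```python
import numpy as np


def string_to_one_hot_sketch(sequence: str, alphabet: str) -> np.ndarray:
    """Hypothetical sketch: encode a string as a (len(sequence), len(alphabet)) array."""
    # Assumption: integer dtype; the library's dtype may differ.
    one_hot = np.zeros((len(sequence), len(alphabet)), dtype=int)
    for position, residue in enumerate(sequence):
        # str.index raises ValueError if the residue is not in the alphabet.
        one_hot[position, alphabet.index(residue)] = 1
    return one_hot


encoded = string_to_one_hot_sketch("ATG", "TGCA")
print(encoded.shape)  # (3, 4)
print(encoded)
```

Each row contains exactly one 1, placed at the index of the corresponding character in the alphabet string.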