To clean your DNA, RNA or protein sequence properly, we need to know which alphabet the sequence uses. For instance, "N" is stripped out if you select a strict DNA alphabet, but it is kept if you select an IUPAC ambiguous alphabet, where N means "any nucleotide". It is also kept if you select a protein alphabet, where N means asparagine. Any character that does not belong to any DNA, RNA or protein alphabet, such as punctuation, spaces, symbols and numbers, is always removed.
This application supports degenerate (ambiguous) IUPAC characters.
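The alphabet-dependent filtering described above can be illustrated with a minimal Python sketch. It is not the application's actual implementation: the alphabet names, the `clean_sequence` function and the choice to uppercase the input are all assumptions made for illustration, and the character sets shown are the standard strict and IUPAC ambiguous nucleotide codes plus the 20 standard amino acids.

```python
# Hypothetical alphabet table; the application's real character sets may differ.
ALPHABETS = {
    "dna_strict": set("ACGT"),
    "dna_iupac": set("ACGTRYSWKMBDHVN"),    # IUPAC ambiguity codes, N = any nucleotide
    "rna_strict": set("ACGU"),
    "protein": set("ACDEFGHIKLMNPQRSTVWY"),  # 20 standard amino acids, N = asparagine
}


def clean_sequence(raw, alphabet):
    """Keep only the characters that belong to the selected alphabet."""
    allowed = ALPHABETS[alphabet]
    # Assumption: input is uppercased before filtering.
    return "".join(ch for ch in raw.upper() if ch in allowed)


raw = "acgn t-1"
print(clean_sequence(raw, "dna_strict"))  # N, space, '-' and '1' are dropped
print(clean_sequence(raw, "dna_iupac"))   # N is kept as "any nucleotide"
print(clean_sequence(raw, "protein"))     # N is kept as asparagine
```

In this sketch the same input yields a different result depending only on the selected alphabet, while spaces, symbols and digits are dropped in every case.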