⚠ Research Use Only

Every compound referenced on this page is discussed strictly for in vitro research and laboratory use. None are approved for human consumption, therapeutic use, or veterinary application. This database is an educational reference only.

What a Peptide Sequence Is

A peptide is a chain of amino acids linked by peptide bonds. The sequence is the ordered list of those amino acids, and it is the most fundamental description of the molecule. Everything else about a peptide (its molecular weight, its structure, its function) follows from the sequence.

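Sequences are written in a fixed direction: from the N-terminus (the amino end) on the left to the C-terminus (the carboxyl end) on the right. This is the direction in which ribosomes assemble peptide chains, and it is the universal convention for reading and recording sequences. Chemical solid-phase synthesis usually builds the chain the other way, from C-terminus to N-terminus, but the written order stays the same.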

This page does two things. First, it provides the reference tools for working with sequences — the amino acid code table and the notation conventions. Second, it provides verified sequences for major research peptides. It is the sequence-focused companion to the Peptide Database, the Molecular Weight Database, and the Half-Life Reference.

The Amino Acid Code Table

The 20 standard amino acids each have a full name, a three-letter code, and a single-letter code. This is the foundational reference for reading any peptide sequence.

Amino Acid | 3-Letter | 1-Letter | Side Chain Property
Alanine | Ala | A | Nonpolar
Arginine | Arg | R | Positively charged (basic)
Asparagine | Asn | N | Polar uncharged
Aspartic acid | Asp | D | Negatively charged (acidic)
Cysteine | Cys | C | Polar (forms disulfide bonds)
Glutamic acid | Glu | E | Negatively charged (acidic)
Glutamine | Gln | Q | Polar uncharged
Glycine | Gly | G | Nonpolar (smallest)
Histidine | His | H | Positively charged (basic)
Isoleucine | Ile | I | Nonpolar
Leucine | Leu | L | Nonpolar
Lysine | Lys | K | Positively charged (basic)
Methionine | Met | M | Nonpolar (sulfur-containing)
Phenylalanine | Phe | F | Nonpolar (aromatic)
Proline | Pro | P | Nonpolar (rigid ring)
Serine | Ser | S | Polar uncharged
Threonine | Thr | T | Polar uncharged
Tryptophan | Trp | W | Nonpolar (aromatic)
Tyrosine | Tyr | Y | Polar (aromatic)
Valine | Val | V | Nonpolar
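
The table above maps naturally onto a lookup keyed by single-letter code. The Python sketch below is illustrative only: the names AMINO_ACIDS and side_chain_summary are hypothetical, and the four side chain classes are a simplification of the Side Chain Property column.

```python
from collections import Counter

# single-letter code -> (three-letter code, simplified side chain class),
# transcribed from the amino acid code table above
AMINO_ACIDS = {
    "A": ("Ala", "nonpolar"), "R": ("Arg", "basic"),
    "N": ("Asn", "polar"),    "D": ("Asp", "acidic"),
    "C": ("Cys", "polar"),    "E": ("Glu", "acidic"),
    "Q": ("Gln", "polar"),    "G": ("Gly", "nonpolar"),
    "H": ("His", "basic"),    "I": ("Ile", "nonpolar"),
    "L": ("Leu", "nonpolar"), "K": ("Lys", "basic"),
    "M": ("Met", "nonpolar"), "F": ("Phe", "nonpolar"),
    "P": ("Pro", "nonpolar"), "S": ("Ser", "polar"),
    "T": ("Thr", "polar"),    "W": ("Trp", "nonpolar"),
    "Y": ("Tyr", "polar"),    "V": ("Val", "nonpolar"),
}

def side_chain_summary(sequence):
    """Count residues per side chain class for a single-letter sequence."""
    return Counter(AMINO_ACIDS[res][1] for res in sequence)

summary = side_chain_summary("GEPPPGKPADDAGLV")  # BPC-157
print(dict(summary))  # {'nonpolar': 11, 'acidic': 3, 'basic': 1}
```

A lookup like this raises a KeyError on any character outside the standard 20, which is a useful early warning when a sequence has been mistyped.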

Common non-standard residues in research peptides

Synthetic research peptides frequently contain amino acids that are not among the standard 20. These have no single-letter code and are always written out:

Residue | Name | Why It's Used
Aib | Alpha-aminoisobutyric acid | Resists enzymatic cleavage; appears in Ipamorelin, Semaglutide, Tirzepatide
D-2-Nal | D-2-naphthylalanine | Bulky synthetic residue; appears in Ipamorelin
D-Trp | D-tryptophan | D-form resists degradation; appears in GHRP-6
D-Phe | D-phenylalanine | D-form resists degradation; appears in GHRP-6, GHRP-2
D-Ala | D-alanine | D-form resists degradation; appears in CJC-1295

How to Read a Peptide Sequence

Take the BPC-157 sequence as a worked example. In three-letter code:

BPC-157, three-letter code: Gly-Glu-Pro-Pro-Pro-Gly-Lys-Pro-Ala-Asp-Asp-Ala-Gly-Leu-Val

The same sequence in single-letter code:

BPC-157, single-letter code: G E P P P G K P A D D A G L V

Reading it: the chain starts at glycine (Gly / G) at the N-terminus on the left, proceeds through fifteen residues, and ends at valine (Val / V) at the C-terminus on the right. Both notations describe the identical molecule — the choice between them is about readability. Three-letter code is clearer when a sequence includes modified residues; single-letter code is more compact for long sequences and is standard in databases.

The rule that matters most

A sequence is always read N-terminus to C-terminus, left to right. If you reverse a sequence, you describe a different molecule. The direction is not a convention you can ignore; it is part of the identity of the peptide.
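
The worked example and the direction rule can both be checked in a few lines. This is a minimal Python sketch, assuming an unmodified chain of standard residues; THREE_TO_ONE and to_single_letter are illustrative names, not part of any library.

```python
# three-letter -> single-letter codes for the 20 standard amino acids
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Glu": "E", "Gln": "Q", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

def to_single_letter(three_letter_seq):
    """Convert a hyphen-separated three-letter sequence (N -> C) to single-letter code.

    Assumes no terminal caps, D-residues, or non-standard residues;
    a KeyError signals that the input contains one of those.
    """
    return "".join(THREE_TO_ONE[code] for code in three_letter_seq.split("-"))

bpc157 = "Gly-Glu-Pro-Pro-Pro-Gly-Lys-Pro-Ala-Asp-Asp-Ala-Gly-Leu-Val"
forward = to_single_letter(bpc157)
reverse = forward[::-1]  # same residues, opposite direction

print(forward, len(forward))        # GEPPPGKPADDAGLV 15
print(reverse, forward == reverse)  # VLGADDAPKGPPPEG False
```

The reversed string has the same composition but is not the same sequence, which is the point of the rule: direction is part of the molecule's identity.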

Terminal & Structural Modifications

Most research peptides are not just bare chains of standard amino acids. They carry modifications — and the notation records them. The four you will encounter constantly:

Ac- — N-terminal acetylation

An Ac- prefix means an acetyl group has been added to the N-terminus. This is a stabilizing modification: it blocks one of the routes by which enzymes degrade peptides. The true TB-500 fragment is written Ac-LKKTETQ — the acetylation is part of what the molecule is.

-NH2 — C-terminal amidation

An -NH2 suffix means the C-terminus is amidated — the terminal carboxyl group is converted to an amide. Like acetylation, it improves resistance to enzymatic degradation. Ipamorelin and the GHRPs are all C-terminally amidated.

D- — D-amino acids

Naturally occurring amino acids are almost all the L-form. A D- prefix on a residue (D-Ala, D-Phe) marks the mirror-image D-form. Synthetic peptides use D-amino acids deliberately: degradation enzymes are shape-specific and do not process D-residues well, so a D-amino acid at a key position slows clearance.

Non-standard residues — Aib and others

Residues like Aib (alpha-aminoisobutyric acid) are not among the standard 20 and are written out in full. Aib is one of the most important in modern peptide design — it resists cleavage by the enzyme DPP-4, and its presence in Semaglutide and Tirzepatide is part of why those peptides have multi-day half-lives.

Why the modifications are the interesting part

The amino acid backbone gives a peptide its core identity, but the modifications often determine whether it is practically useful. Acetylation, amidation, D-residues, and Aib are all answers to the same problem: native peptides are cleared too fast. Reading the modifications tells you how a peptide was engineered.
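
The notation described above is regular enough to read mechanically. The Python sketch below is one possible way to do it, not an established parser: the token pattern and the function name describe_modifications are assumptions made for illustration, and the pattern only covers the forms used on this page (Ac-, -NH2, D-Xxx, D-2-Nal, Aib).

```python
import re

STANDARD = {"Ala", "Arg", "Asn", "Asp", "Cys", "Glu", "Gln", "Gly", "His", "Ile",
            "Leu", "Lys", "Met", "Phe", "Pro", "Ser", "Thr", "Trp", "Tyr", "Val"}

# Tried in order: D-residue with a locant (D-2-Nal), plain D-residue (D-Phe),
# then any other hyphen-free token (Gly, Aib, Ac, NH2).
TOKEN = re.compile(r"D-\d-[A-Za-z]+|D-[A-Za-z]+|[A-Za-z0-9]+")

def describe_modifications(sequence):
    """Summarize terminal caps and unusual residues in a three-letter sequence."""
    tokens = TOKEN.findall(sequence)
    acetylated = tokens[0] == "Ac"
    amidated = tokens[-1] == "NH2"
    start = 1 if acetylated else 0
    end = len(tokens) - 1 if amidated else len(tokens)
    residues = tokens[start:end]  # terminal caps are not residues
    return {
        "n_terminal_acetyl": acetylated,
        "c_terminal_amide": amidated,
        "residue_count": len(residues),
        "d_residues": [r for r in residues if r.startswith("D-")],
        # Strip any D- prefix before checking against the standard 20.
        "non_standard": [r for r in residues
                         if (r[2:] if r.startswith("D-") else r) not in STANDARD],
    }

ipamorelin = describe_modifications("Aib-His-D-2-Nal-D-Phe-Lys-NH2")
tb500_fragment = describe_modifications("Ac-Leu-Lys-Lys-Thr-Glu-Thr-Gln")
print(ipamorelin)
print(tb500_fragment)
```

For Ipamorelin this reports an amidated C-terminus, five residues, two D-residues (D-2-Nal, D-Phe), and two residues outside the standard 20 (Aib, D-2-Nal). For the TB-500 fragment it reports an acetylated N-terminus and seven standard residues.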

Verified Sequence Reference

Amino acid sequences for major research peptides, each verified against multiple independent published sources. Sequences are given in three-letter code with modifications noted; the residue count is the count of amino acids in the chain.

Peptide | Residues | Sequence (N → C)
GHK-Cu | 3 | Gly-His-Lys (as copper complex)
Selank | 7 | Thr-Lys-Pro-Arg-Pro-Gly-Pro
Semax | 7 | Met-Glu-His-Phe-Pro-Gly-Pro
TB-500 fragment (true Ac-LKKTETQ) | 7 | Ac-Leu-Lys-Lys-Thr-Glu-Thr-Gln
GHRP-6 | 6 | His-D-Trp-Ala-Trp-D-Phe-Lys-NH2
Ipamorelin | 5 | Aib-His-D-2-Nal-D-Phe-Lys-NH2
BPC-157 | 15 | Gly-Glu-Pro-Pro-Pro-Gly-Lys-Pro-Ala-Asp-Asp-Ala-Gly-Leu-Val
Sermorelin (GHRH 1-29) | 29 | Tyr-Ala-Asp-Ala-Ile-Phe-Thr-Asn-Ser-Tyr-Arg-Lys-Val-Leu-Gly-Gln-Leu-Ser-Ala-Arg-Lys-Leu-Leu-Gln-Asp-Ile-Met-Ser-Arg-NH2
CJC-1295 (no DAC / Mod GRF 1-29) | 29 | Tyr-D-Ala-Asp-Ala-Ile-Phe-Thr-Gln-Ser-Tyr-Arg-Lys-Val-Leu-Ala-Gln-Leu-Ser-Ala-Arg-Lys-Leu-Leu-Gln-Asp-Ile-Leu-Ser-Arg-NH2
Thymosin β4 (sold as TB-500) | 43 | Ser-Asp-Lys-Pro-Asp-Met-Ala-Glu-Ile-Glu-Lys-Phe-Asp-Lys-Ser-Lys-Leu-Lys-Lys-Thr-Glu-Thr-Gln-Glu-Lys-Asn-Pro-Leu-Pro-Ser-Lys-Glu-Thr-Ile-Glu-Gln-Glu-Lys-Gln-Ala-Gly-Glu-Ser

The same sequences in single-letter code

Peptide | Single-Letter Sequence
GHK-Cu | GHK
Selank | TKPRPGP
Semax | MEHFPGP
TB-500 fragment | Ac-LKKTETQ
BPC-157 | GEPPPGKPADDAGLV
Thymosin β4 | SDKPDMAEIEKFDKSKLKKTETQEKNPLPSKETIEQEKQAGES

Single-letter code is shown only for peptides composed entirely of standard amino acids. Peptides containing non-standard or D-residues, namely Ipamorelin (Aib, D-2-Nal, D-Phe), GHRP-6 (D-Trp, D-Phe), and CJC-1295 (D-Ala), are not cleanly representable in single-letter code and are best read in three-letter form above. Sermorelin contains only standard residues; its only modification is the C-terminal amide, and it is listed in three-letter form above.

Sequence and Function

The sequences above are not arbitrary — and a few patterns in them are worth noticing, because they connect sequence to behavior.

Fragments of larger molecules. Several of these peptides are sequences lifted from larger natural proteins. Sermorelin is the first 29 amino acids of growth hormone-releasing hormone, a 44-residue hormone — the "1-29" in its name is literal. The TB-500 fragment Ac-LKKTETQ is a seven-residue stretch of the 43-residue Thymosin β4. BPC-157 is a fifteen-residue partial sequence of a larger protein found in gastric juice.
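
The fragment relationship can be confirmed with a plain substring search. This is a small Python sketch using the single-letter sequences listed on this page; the variable names are illustrative.

```python
thymosin_beta4 = "SDKPDMAEIEKFDKSKLKKTETQEKNPLPSKETIEQEKQAGES"
fragment = "LKKTETQ"  # TB-500 fragment, written without its Ac- cap

start = thymosin_beta4.find(fragment)  # 0-based index, -1 if absent
first = start + 1                      # 1-based position of the first residue
last = start + len(fragment)           # 1-based position of the last residue

print(len(thymosin_beta4), first, last)  # 43 17 23
```

With the sequences as given here, the seven-residue stretch occupies positions 17 to 23 of the 43-residue parent chain.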

Engineered variants of fragments. CJC-1295 (no DAC) is Sermorelin with four amino acid substitutions, at positions 2, 8, 15, and 27; compare the two sequences in the table above and the differences are visible. Those substitutions are why Sermorelin clears in minutes while CJC-1295 resists degradation longer, despite being nearly the same chain.
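
The comparison can be made position by position. The Python sketch below assumes the two three-letter sequences exactly as listed in the table on this page; the token pattern and the helper name residues are illustrative, not a standard API.

```python
import re

# D-residues (D-Ala) are kept as one token; everything else splits on hyphens.
TOKEN = re.compile(r"D-[A-Za-z]+|[A-Za-z0-9]+")

sermorelin = ("Tyr-Ala-Asp-Ala-Ile-Phe-Thr-Asn-Ser-Tyr-Arg-Lys-Val-Leu-Gly-Gln-"
              "Leu-Ser-Ala-Arg-Lys-Leu-Leu-Gln-Asp-Ile-Met-Ser-Arg-NH2")
cjc1295 = ("Tyr-D-Ala-Asp-Ala-Ile-Phe-Thr-Gln-Ser-Tyr-Arg-Lys-Val-Leu-Ala-Gln-"
           "Leu-Ser-Ala-Arg-Lys-Leu-Leu-Gln-Asp-Ile-Leu-Ser-Arg-NH2")

def residues(sequence):
    """Return the residue tokens of a sequence, dropping terminal caps."""
    return [t for t in TOKEN.findall(sequence) if t not in ("Ac", "NH2")]

a, b = residues(sermorelin), residues(cjc1295)
assert len(a) == len(b) == 29

# 1-based positions where the two chains differ
differences = [(pos, x, y) for pos, (x, y) in enumerate(zip(a, b), start=1) if x != y]
for pos, x, y in differences:
    print(f"position {pos}: {x} -> {y}")
# position 2: Ala -> D-Ala
# position 8: Asn -> Gln
# position 15: Gly -> Ala
# position 27: Met -> Leu
```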

Proline-rich short peptides. Selank (TKPRPGP) and Semax (MEHFPGP) both end in -Pro-Gly-Pro. Proline's rigid ring makes a chain harder for enzymes to process; that shared PGP tail is a stabilizing motif, not a coincidence.
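
The shared tail is easy to extract programmatically. A minimal Python sketch, with common_suffix as a hypothetical helper name:

```python
def common_suffix(a, b):
    """Longest shared C-terminal stretch of two single-letter sequences."""
    n = 0
    # Walk backwards from the C-terminus while the residues agree.
    while n < min(len(a), len(b)) and a[-1 - n] == b[-1 - n]:
        n += 1
    return a[len(a) - n:]

selank, semax = "TKPRPGP", "MEHFPGP"
tail = common_suffix(selank, semax)
print(tail)  # PGP
```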

Sequence is the master key

Molecular weight, structure, half-life, and function all derive from sequence. This is why a sequence database sits alongside the molecular weight and half-life references: it is the layer the others are built on.
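
As a concrete case of a property derived from sequence, the average molecular mass of an unmodified linear peptide is the sum of its residue masses plus one water. The Python sketch below rests on assumptions: the residue masses are commonly tabulated average values, not figures taken from this page, and the calculation ignores terminal caps, salt forms, and counter-ions. For authoritative values, use the Molecular Weight Database or a certificate of analysis.

```python
# Average residue masses in daltons (free amino acid minus one water).
# Commonly tabulated reference values; an assumption, not data from this page.
RESIDUE_MASS = {
    "G": 57.0519,  "A": 71.0788,  "S": 87.0782,  "P": 97.1167,
    "V": 99.1326,  "T": 101.1051, "C": 103.1388, "L": 113.1594,
    "I": 113.1594, "N": 114.1038, "D": 115.0886, "Q": 128.1307,
    "K": 128.1741, "E": 129.1155, "M": 131.1926, "H": 137.1411,
    "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528  # added once for the free N- and C-termini

def average_mass(sequence):
    """Average molecular mass of an unmodified linear peptide (single-letter code)."""
    return sum(RESIDUE_MASS[res] for res in sequence) + WATER

bpc157_mass = average_mass("GEPPPGKPADDAGLV")
print(round(bpc157_mass, 2))  # about 1419.55
```

Changing a single residue changes the result, which is one practical reason an exact sequence matters.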

Why This Database Verifies Sequence by Sequence

This page publishes sequences only where they have been verified against multiple independent published sources. That is a deliberate constraint, and it is worth explaining.

An amino acid sequence is an exact claim. A single wrong residue describes a different molecule. Unlike an approximate molecular weight or a half-life range — where "approximately" is honest and useful — a sequence is either exactly right or it is wrong. There is no useful approximate sequence.

For that reason, this database does not attempt to list a sequence for every one of the 45+ compounds in the broader Peptide Database. It publishes the sequences that are firmly verified, and it expands the verified set over time. Additional sequences are confirmed and published on individual compound pages, each alongside its source citations and PubChem identifier, as those pages are built.

⚠ How to verify a sequence yourself

For any peptide not yet listed here, the authoritative sources are PubChem, UniProt (for sequences derived from natural proteins), and the manufacturer's certificate of analysis for the specific product. Be aware that vendor marketing pages sometimes contain transcription errors, so cross-check against a primary chemical database before relying on a sequence.

Frequently Asked Questions

How is a peptide sequence written?

A peptide sequence is written from the N-terminus on the left to the C-terminus on the right, the same direction in which ribosomes assemble a peptide chain and in which sequences are conventionally read. Each amino acid is represented by either a three-letter code (such as Gly for glycine) or a single-letter code (such as G for glycine). Terminal modifications are noted at the ends, for example Ac- for an acetylated N-terminus or -NH2 for an amidated C-terminus.

What is the difference between single-letter and three-letter amino acid codes?

Both represent the same 20 standard amino acids. The three-letter code (Gly, Ala, Ser) is more readable and is standard in sequences containing modified or non-standard residues. The single-letter code (G, A, S) is compact and is standard for long sequences and database entries. BPC-157, for example, is GEPPPGKPADDAGLV in single-letter code.

What does Ac- mean at the start of a peptide sequence?

Ac- indicates that the N-terminus of the peptide is acetylated — an acetyl group has been added to the start of the chain. This is a common modification that increases stability against enzymatic degradation. The true TB-500 fragment, for example, is written Ac-LKKTETQ because its N-terminus is acetylated.

What is a D-amino acid in a peptide sequence?

Most naturally occurring amino acids are the L-form. A D-amino acid is the mirror-image form. Synthetic peptides often incorporate D-amino acids deliberately because enzymes that degrade peptides do not recognize them well, which increases stability. In a sequence, a D-amino acid is marked, for example D-Phe or D-Ala. CJC-1295 contains a D-Ala at position 2.

What does -NH2 at the end of a peptide sequence mean?

-NH2 indicates that the C-terminus of the peptide is amidated — the terminal carboxyl group has been converted to an amide. Like N-terminal acetylation, C-terminal amidation is a common modification that improves resistance to enzymatic degradation. Many research peptides, including Ipamorelin and GHRP-6, are C-terminally amidated.

What is Aib in a peptide sequence?

Aib stands for alpha-aminoisobutyric acid, a non-standard, non-coded amino acid. It is not one of the 20 standard amino acids and has no single-letter code in standard notation. It is used in synthetic peptides because it resists enzymatic cleavage — it appears in Ipamorelin and in the GLP-1 receptor agonists Semaglutide and Tirzepatide, where it contributes to their long half-lives.

Why isn't every peptide's sequence listed here?

A sequence is an exact claim — one wrong residue describes a different molecule, and there is no such thing as a useful "approximate" sequence. This database publishes only sequences verified against multiple independent published sources, and expands that set over time. Sequences for other compounds are verified and published on individual compound pages with their source citations.

Is the sequence the same for a peptide regardless of manufacturer?

Yes — the sequence defines the molecule, so a correctly manufactured peptide has the same sequence regardless of supplier. What can differ between manufacturers is purity, salt form, and the presence of synthesis-related impurities, none of which change the target sequence. Those quality attributes are documented on the certificate of analysis.

Sequences are published only where verified against multiple independent published sources. For any precise work, confirm the sequence against a primary chemical database (PubChem, UniProt) and the certificate of analysis for the specific product.