We have a Discord community now!
We have launched our Discord community dedicated to genetic data storage and management. Feel free to join us! Click here to join the server

This article will be useful for researchers performing comparative genomics, scientists identifying clinically relevant variants, molecular biologists, students and educators seeking reference sequences.
Three of the main global repositories, GenBank (USA), DDBJ (Japan), and ENA (Europe), exchange data daily as part of the International Nucleotide Sequence Database Collaboration (INSDC). Other major repositories, such as China's Genome Sequence Archive (GSA), further contribute to global data sharing.
INSDC (USA (GenBank), Japan (DDBJ) and Europe (ENA))
Data submitted to any of these databases are automatically exchanged on a daily basis, ensuring the global dissemination of sequence records across all three platforms. This systematic synchronization guarantees consistent and equivalent access to identical datasets, irrespective of the INSDC partner used for data submission or retrieval.
GenBank represents one of the largest and most comprehensive publicly accessible repositories of nucleotide sequence data worldwide. Submitted records undergo both automated and manual curation procedures aimed at ensuring data integrity, accuracy, and compliance with established formatting and annotation standards. The database is closely integrated within the broader NCBI ecosystem, encompassing resources such as PubMed, BLAST, RefSeq, dbSNP, and ClinVar. This integration facilitates a wide range of research activities, including similarity searches using BLAST, access to high-quality curated reference sequences, linkage of genomic data with the scientific literature, identification of clinically relevant variants, and seamless navigation across interconnected biological databases.
GenBank sequence records may be downloaded from the FTP site or accessed using NCBI's E-utilities API. SRA sequence records are available using the SRA Toolkit API or on Amazon Web Services (AWS) and Google Cloud Platform (GCP) clouds. SRA availability on cloud platforms enables rapid access to large datasets.
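As a rough illustration of the E-utilities route, the sketch below builds an `efetch` request for a single nucleotide record and downloads it as text. The helper names (`build_efetch_url`, `fetch_record`) and the example accession are illustrative choices, not part of NCBI's interface. Check NCBI's usage guidelines (rate limits, API keys) before running anything like this at scale.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Base address of NCBI's E-utilities service.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def build_efetch_url(accession, rettype="fasta", db="nuccore"):
    """Build an efetch URL for one record (hypothetical helper name)."""
    params = {"db": db, "id": accession, "rettype": rettype, "retmode": "text"}
    return f"{EUTILS_BASE}/efetch.fcgi?{urlencode(params)}"


def fetch_record(accession, rettype="fasta"):
    """Download the record as text; this call needs network access."""
    with urlopen(build_efetch_url(accession, rettype)) as response:
        return response.read().decode("utf-8")
```

For example, `fetch_record("MN908947.3")` should return the record in FASTA format, and passing `rettype="gb"` requests the GenBank flat-file format instead.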
The SRA Toolkit is a collection of command-line utilities provided by NCBI for accessing, downloading, and converting sequencing data stored in the Sequence Read Archive (SRA). It is the primary toolset used by researchers to work with raw high-throughput sequencing data locally or within automated pipelines.
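A typical local workflow with the toolkit is to fetch a run with `prefetch` and then convert it to FASTQ with `fasterq-dump`. The sketch below wraps those two commands for use in a pipeline. It assumes the SRA Toolkit is installed and on `PATH`; the function names and the default output directory are hypothetical.

```python
import subprocess


def sra_download_commands(accession, outdir="fastq"):
    """Return the two SRA Toolkit commands as argument lists."""
    return [
        # Step 1: download the run from SRA into the working directory.
        ["prefetch", accession],
        # Step 2: convert it to FASTQ, splitting paired reads into separate files.
        ["fasterq-dump", "--split-files", "--outdir", outdir, accession],
    ]


def download_run(accession, outdir="fastq"):
    """Run the commands in order; raises CalledProcessError on failure."""
    for command in sra_download_commands(accession, outdir):
        subprocess.run(command, check=True)
```

Keeping command construction separate from execution makes it easy to log or inspect the exact commands before any large download starts.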
To submit data to the world's largest public repository of biological and scientific information, use the NCBI Submission Portal.
DDBJ offers efficient support for next-generation sequencing (NGS) data, including large-scale datasets such as metagenomic and transcriptomic data. Its infrastructure is optimized for high-volume data deposition, which makes it a good option for projects that generate extensive NGS output or require rapid, large-scale submission workflows.
Services: GEA, MetaboBank, BioProject, BioSample, MSS, NSSS, AGD, JGA (Japanese Genotype-phenotype Archive), NBDC Human Database, ARSA, getentry, DDBJ Search, TXSearch, GGGenome, GGRNA, DFAST, CRISPRdirect, Gendoo, RefEx, DDBJ Core Database, DDBJ-LD, TogoVar, TogoVar-repository, VecScreen.
The DDBJ Sequence Read Archive (DRA) stores raw sequencing data and alignment information to enhance reproducibility and facilitate new discoveries through data analysis. DRA is part of the International Nucleotide Sequence Database Collaboration (INSDC) and archives data in collaboration with the NCBI Sequence Read Archive and the EBI European Nucleotide Archive.
You can download data files in formats such as FASTQ and SRA.
To submit data, create a DDBJ account and register a public key with your account, then upload the data files to the submission directory on the file server.
ENA supports a broad spectrum of data types, ranging from raw next-generation sequencing (NGS) reads to assembled genomes and annotated sequences. It is closely integrated with other EMBL-EBI databases (e.g., UniProt, Ensembl), enabling richer biological insights. It is also convenient for uploading data in accordance with European legislation requirements.
Access to ENA data is provided through the browser, through search tools, through large scale file download and through the API.
Providing users with the ability to download submitted data for further analysis purposes is a key part of ENA’s mission. Files are therefore made available through a public FTP server.
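One way to locate those files programmatically is to ask the ENA Portal API for a file report and read the FASTQ locations from it. The sketch below is a minimal example under stated assumptions: the endpoint, the `read_run` result type, and the `fastq_ftp` field follow ENA's API as commonly documented, but should be checked against the current ENA documentation. All function names are hypothetical.

```python
import csv
import io
from urllib.parse import urlencode
from urllib.request import urlopen

# File report endpoint of the ENA Portal API (verify against ENA's docs).
ENA_FILEREPORT = "https://www.ebi.ac.uk/ena/portal/api/filereport"


def build_filereport_url(accession):
    """Build a file report request for a run, experiment, or study accession."""
    params = {
        "accession": accession,
        "result": "read_run",
        "fields": "run_accession,fastq_ftp",
        "format": "tsv",
    }
    return f"{ENA_FILEREPORT}?{urlencode(params)}"


def parse_fastq_urls(tsv_text):
    """Map each run accession to its list of FASTQ URLs."""
    urls = {}
    for row in csv.DictReader(io.StringIO(tsv_text), delimiter="\t"):
        # The field holds semicolon-separated paths without a URL scheme.
        paths = [p for p in row["fastq_ftp"].split(";") if p]
        urls[row["run_accession"]] = ["ftp://" + p for p in paths]
    return urls


def fastq_urls_for(accession):
    """Fetch and parse the file report; this call needs network access."""
    with urlopen(build_filereport_url(accession)) as response:
        return parse_fastq_urls(response.read().decode("utf-8"))
```

The returned URLs can then be passed to any FTP-capable download tool. Parsing is kept separate from the network call so it can be tested offline.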
ENA provides several ways to access the data it hosts, suiting a range of use cases and levels of computational expertise.
Submitting and updating data: Getting Started
A key strength of GSA is its strong support for next-generation sequencing (NGS) data, including whole-genome, transcriptomic, epigenomic, and metagenomic datasets.
National and regional bioinformatics centers offer access to specialized datasets, analytical tools, training resources, and computational infrastructure. They are particularly useful when working with population-specific datasets or region-specific research initiatives.
Public datasets can be downloaded directly from the web interface or via FTP. On each dataset page, a Download option is available.
Submitting and updating data
Australia (Bioplatforms)
South Korea (Korean Bioinformation Center)
France (France Génomique)
Switzerland (Swiss Institute of Bioinformatics)
By leveraging these global resources, the scientific community can efficiently share data, maintain data integrity, and accelerate genomic research, contributing to a more open, collaborative, and data-driven future in molecular biology and genomics.
Here is a comprehensive list of companies that produce DNA sequencing equipment, spanning established multinational corporations and innovative startups.
| Company | Flagship Technology/Platform |
|---|---|
| Illumina | NovaSeq, MiSeq, HiSeq |
| Thermo Fisher | Ion Torrent, Applied Biosystems |
| MGI/BGI | DNBSEQ-G50/T7/2000 |
| Oxford Nanopore | MinION, GridION |
| PacBio | Sequel, RS II |
| Element Biosciences | AVITI |
| Ultima Genomics | UG 100 |