A BAM index file used in bioinformatics. It accompanies a BAM file (Binary Alignment/Map) and allows for rapid access to specific portions of the BAM file without reading through the entire dataset.
The compressed binary version of the Sequence Alignment/Map (SAM) format. It is used for storing large amounts of nucleotide sequence alignment data efficiently.
Browser Extensible Data is a text file format used to define data lines that are displayed in an annotation track within the genome browser. Each line describes a genomic region with fields including chromosome, start position, end position, and optionally, name, score, and strand.
A .bim file is part of the PLINK binary format used in genomics. It contains extended variant information accompanying a .bed binary genotype table, including fields such as chromosome code, variant identifier, genetic distance, base-pair coordinate, and allele information.
Comma-Separated Values file is a plain text file that contains data separated by commas. Each line in the file corresponds to a data record, and each record consists of fields separated by commas. CSV files are commonly used for data exchange between different applications.
general knowledge
docx
A .docx file is a Microsoft Word Open XML Format Document file. It is a zipped, XML-based file format developed by Microsoft for representing word processing documents. DOCX files can contain text, images, tables, and other document elements.
general knowledge
dose
context dependent
erate
context dependent
fam
PLINK binary format used for genomics data. It contains sample information accompanying a .bed binary genotype table, including fields such as family ID, individual ID, paternal ID, maternal ID, sex, and phenotype.
A text-based format for storing both a biological sequence (usually nucleotide sequence) and its corresponding quality scores. It is commonly used in bioinformatics for storing high-throughput sequencing data.
Flow Cytometry Standard file is a standard format for flow cytometry data. It contains information about the physical and chemical properties of particles or cells as they flow in a fluid stream through a beam of light.
general knowledge
geojson
A format for encoding a variety of geographic data structures using JavaScript Object Notation (JSON). It supports point, line, polygon, and multi-part collections of these types, making it useful for web-based mapping and GIS applications.
general knowledge
h5
Hierarchical Data Format version 5 (HDF5) file. It is used to store large amounts of data, including complex data types, and is commonly used in scientific computing.
general knowledge
h5ad
info
A generic file extension that can refer to various types of information files, often containing metadata or documentation related to a specific application or dataset.
general knowledge
mtx
Matrix Market file format used to represent sparse matrices. It is commonly used in scientific computing to exchange matrix data.
general knowledge
parquet
A columnar storage file format optimized for use with big data processing frameworks. It is efficient for both storage and retrieval, supporting complex nested data structures.
general knowledge
pdf
Portable Document Format file is a file format developed by Adobe that presents documents independently of software, hardware, or operating systems. It can contain text, images, and vector graphics.
general knowledge
rds
An R data file used to store a single R object. It is used in the R programming environment for saving and restoring R objects.
general knowledge
rec
This file extension is used by various applications to denote a recording file, which may contain audio, video, or other recorded data. The specific format can vary depending on the application that created it.
general knowledge
tar.gz
A compressed archive file created by combining files into a single archive using the tar command and then compressing it using Gzip compression. It is commonly used in Unix and Linux environments for file distribution and backup.
general knowledge
tbi
Tabix index file used in bioinformatics. It indexes .vcf.gz files (compressed Variant Call Format) to allow rapid retrieval of data lines overlapping regions of interest.
general knowledge
tgz
A compressed archive file equivalent to a .tar.gz file. It is created by combining files into a single archive using the tar command and then compressing it using Gzip compression.
general knowledge
tsv
Tab-Separated Values file is similar to a CSV file, but it uses tabs instead of commas to separate values. It is commonly used for representing structured data.
general knowledge
txt
Text file is a standard text document that contains unformatted text. It is recognized by any text editing or word processing program.
general knowledge
vcf
Variant Call Format file is used in bioinformatics for storing gene sequence variations. It is a text file format containing meta-information lines, a header line, and data lines.
general knowledge
xls
A Microsoft Excel file format used to store spreadsheets. It supports tabular data, formulas, charts, and more. It has been largely replaced by the .xlsx format.
general knowledge
xlsx
A Microsoft Excel Open XML Format Spreadsheet file. It is a zipped, XML-based file format used to represent spreadsheets.
general knowledge
zip
An archive file format that supports lossless data compression. It can contain one or more compressed files or directories.