What is raw DNA analysis?

Raw DNA analysis takes the unprocessed genotyping data from a consumer DNA test (23andMe, AncestryDNA, MyHeritage, FamilyTreeDNA) and interprets it against published clinical guidelines like CPIC. It identifies variants in pharmacogenomic, nutrition, and other health-relevant genes that your original testing company does not report on. It is educational — not a clinical diagnostic.

Can I get pharmacogenomic results from 23andMe raw data?

Yes, for most users. 23andMe raw data includes SNPs in the main CPIC-actionable pharmacogenes: CYP2D6, CYP2C19, CYP2C9, CYP3A5, TPMT, DPYD, VKORC1, and SLCO1B1. Coverage is partial compared to clinical PGx panels — consumer arrays cannot detect rare variants, gene deletions, or duplications — but the common variants that drive most drug-response phenotypes are genotyped.

How much does raw DNA analysis cost?

Decode+ is free to start, and a one-time $59 payment unlocks all your results for life. Clinical pharmacogenomic testing (GeneSight, Genomind, OneOme) costs $250–$2,000+, so reusing your existing raw DNA is far cheaper.

Which consumer DNA tests work with raw DNA analysis?

DecodeMyBio accepts raw data from 23andMe (all versions), AncestryDNA (v2+), MyHeritage, FamilyTreeDNA (Family Finder), LivingDNA, and tellmeGen. We also accept VCF files from whole-genome sequencing services like Nebula Genomics. Upload formats include .txt, .csv, .zip, and .vcf.gz — no pre-processing required.

Can I use raw DNA data to replace clinical pharmacogenomic testing?

For informational purposes, yes. For high-stakes clinical decisions (e.g. starting warfarin, clopidogrel, or tamoxifen in complex cases), a provider-ordered clinical PGx test is more appropriate because it covers rare variants and copy-number changes that consumer arrays miss. Treat raw DNA analysis as a first-look that can inform conversations with your prescriber.

Does the raw DNA file need to be recent?

No. Your genome does not change. Raw DNA files from 23andMe v3 (2010), v4 (2013), and v5 (2017), and from AncestryDNA v1 (2012) and v2 (2016), are all valid; newer chips simply cover more pharmacogenomic markers. DecodeMyBio shows confidence levels based on how many markers your file contains.

What gene coverage should I expect from my file?

23andMe v5 users typically get 15–25 pharmacogenomic markers across the major CPIC genes. AncestryDNA v2+ files typically include 7–15 markers. MyHeritage files vary. DecodeMyBio displays a "coverage level" (high / moderate / limited) so you know what to expect before you decode.

Raw DNA Analysis: The Complete 2026 Guide

10 min read · Updated April 2026 · DecodeMyBio Editorial Team

Raw DNA analysis takes the unprocessed genotyping file from a consumer DNA test (23andMe, AncestryDNA, MyHeritage, FamilyTreeDNA) and interprets it against clinical guidelines like CPIC. It reveals how your body processes medications, metabolizes nutrients, and responds to compounds like caffeine, alcohol, and THC. It is educational, free to start with Decode+, and requires no new test.

What “Raw DNA Data” Actually Is

When you take a 23andMe or AncestryDNA test, the lab genotypes a predefined set of positions in your genome — typically around 600,000 to 700,000 single-nucleotide polymorphisms (SNPs). The company uses a subset of those to generate the ancestry breakdown, trait predictions, and health reports you see in your account dashboard. The remainder sits unused.

Your raw data file is the full list of those SNP calls — each line shows one position in the genome (identified by rsID, chromosome, and base-pair position) and the two letters your genome has at that position. It is roughly 15–25 MB as a plain text file, usually delivered as a .zip you can download from your account settings.
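To make that layout concrete, here is a minimal Python sketch of reading such a file. It assumes the four-column, tab-separated layout with a `#` comment header that 23andMe exports use; other vendors lay out their columns differently. The names `SnpCall` and `parse_raw_lines` and the sample rows are illustrative only, not DecodeMyBio's actual parser.

```python
from typing import Dict, Iterable, NamedTuple


class SnpCall(NamedTuple):
    rsid: str
    chromosome: str
    position: int
    genotype: str  # two letters such as "AG"; "--" marks a no-call


def parse_raw_lines(lines: Iterable[str]) -> Dict[str, SnpCall]:
    """Parse tab-separated raw-data lines into a dict keyed by rsID."""
    calls: Dict[str, SnpCall] = {}
    for line in lines:
        line = line.strip()
        # Skip blank lines and the '#' comment header at the top of the file.
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) != 4:
            continue  # not the four-column layout assumed in this sketch
        rsid, chromosome, position, genotype = fields
        if not position.isdigit():
            continue  # e.g. a column-name row without a leading '#'
        calls[rsid] = SnpCall(rsid, chromosome, int(position), genotype)
    return calls


# Illustrative rows only; a real file has several hundred thousand lines.
sample = [
    "# rsid\tchromosome\tposition\tgenotype",
    "rs4244285\t10\t96541616\tAG",
    "rs12248560\t10\t96521657\tCC",
    "rs0000000\t1\t12345\t--",
]
calls = parse_raw_lines(sample)
print(calls["rs4244285"].genotype)  # AG
```

Keying the table by rsID keeps later lookups (for example, "what is my call at rs4244285?") to a single dictionary access.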

What makes raw data powerful is that it contains variants in every major pharmacogenomic gene — not just the handful that your testing company chose to report on. Consumer testing companies focus on ancestry and a curated set of wellness traits because that is their product. But the biological information in your file goes far beyond what they show you.

Raw DNA analysis services — DecodeMyBio and others — parse that file, identify the clinically-meaningful variants, and map them to published clinical guidelines so you can see how your genetics affect real medications, nutrition, and daily life.

Which Consumer DNA Tests Let You Download Raw Data?

Most major consumer testing services provide raw data exports. Coverage and chip technology vary — so the number of pharmacogenomic markers in your file depends on which service you used and which version of their chip was used for your sample.

Service	Raw data download?	Typical PGx markers	File format
23andMe (v3–v5)	Yes — free, 5-min export	15–25	.txt inside .zip
AncestryDNA (v1–v2+)	Yes — free, 5-min export	7–15	.txt inside .zip
MyHeritage DNA	Yes — free	10–20	.csv
FamilyTreeDNA (Family Finder)	Yes — free	10–18	.csv
LivingDNA / tellmeGen	Yes — free	Varies	.txt / .csv
Nebula Genomics (WGS)	Yes	All variants	.vcf.gz
Helix	No standard export	—	—

If you already have a file from any of the first six providers above, you can run raw DNA analysis today. If you are deciding between tests to take now and plan to reuse the data for pharmacogenomics later, 23andMe v5 gives the widest PGx marker coverage for the money. For a deeper breakdown, see our AncestryDNA guide.

How to Download Your Raw DNA from 23andMe

Sign in to your 23andMe account at 23andme.com.
Open Settings — click your profile icon top-right and select “Settings”.
Find Browse Raw Data at the bottom of the Settings page, under the “23andMe Data” section, then click the “Download” tab.
Confirm and request — re-enter your password, answer the security question, and click “Submit Request”. 23andMe emails the link within about 5 minutes.
Open the email, click the download link and save the .zip file. The file name starts with genome_ and ends in .zip. You do not need to unzip it before uploading.
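If you are curious why unzipping is unnecessary, the sketch below shows one way a service could read the text file straight out of the archive using Python's standard `zipfile` module. The function name `read_raw_text` and the assumption that the archive holds a single .txt member are ours, for illustration; this is not a description of how any particular service handles uploads.

```python
import io
import zipfile
from typing import List


def read_raw_text(data: bytes) -> List[str]:
    """Return the lines of a raw-data upload, whether zipped or plain text."""
    buffer = io.BytesIO(data)
    if zipfile.is_zipfile(buffer):
        with zipfile.ZipFile(buffer) as archive:
            # Assumption: the archive holds one .txt member with the calls.
            names = [n for n in archive.namelist() if n.lower().endswith(".txt")]
            if not names:
                raise ValueError("no .txt file found inside the archive")
            text = archive.read(names[0]).decode("utf-8", errors="replace")
    else:
        text = data.decode("utf-8", errors="replace")
    return text.splitlines()


# Build a tiny in-memory archive to stand in for a downloaded genome_*.zip.
demo = io.BytesIO()
with zipfile.ZipFile(demo, "w") as archive:
    archive.writestr("genome_example.txt", "# header\nrs4244285\t10\t96541616\tAG\n")

lines = read_raw_text(demo.getvalue())
print(len(lines))  # 2
```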

How to Download Your Raw DNA from AncestryDNA

Sign in to ancestry.com.
Open DNA settings — hover over “DNA” in the top navigation, click “Your DNA Results Summary”, then “Settings” in the upper right.
Find “Download your raw DNA data” and click “Download”.
Confirm — enter your password and agree to the download terms.
Open the confirmation email and download the .zip file that contains your raw .txt data.

MyHeritage and FamilyTreeDNA follow a similar flow — look for “Manage DNA kits” or “Raw data” in your account settings. Most downloads are available in under a minute after you confirm.

What Raw DNA Analysis Can Tell You

Consumer genotyping arrays cover most of the common variants with clinical evidence in the main CPIC-actionable genes. Specifically, a typical 23andMe or AncestryDNA raw data file contains interpretable information about:

Drug metabolism — how quickly or slowly your body processes roughly 150 common medications, including SSRIs, tricyclics, opioids, proton-pump inhibitors, and blood thinners. See our psychiatric medication insights.
Nutrient metabolism — MTHFR, COMT, VDR, BCMO1 variants affecting folate, vitamin D, beta-carotene and methylation. See the nutrition insights.
Pain and anesthesia response — CYP2D6 and CYP2C9 variants affecting codeine, tramadol, and NSAID response. See the pain insights.
Cannabis and THC metabolism — CYP2C9, AKT1, CNR1 variants that affect how edibles and THC-containing products hit you. See the cannabis insights.
Celiac and gluten sensitivity — HLA-DQ2 and HLA-DQ8 haplotype screening. See the celiac insights.
Common traits — caffeine metabolism, lactose tolerance, alcohol flush, bitter taste, earwax type, and dozens more — all part of Decode+.

Main Pharmacogenomic Genes Detectable in Consumer Raw Data

These are the genes with the strongest CPIC-level evidence where consumer arrays reliably capture the common phenotype-driving variants. Click any gene name for the deep-dive page with phenotype tables and medication interactions.

Gene	Affects	Consumer array coverage
CYP2D6	Codeine, tramadol, SSRIs, tricyclics, tamoxifen, atomoxetine	Common SNPs covered; copy-number changes not detected
CYP2C19	Clopidogrel, omeprazole, pantoprazole, escitalopram, citalopram	Excellent — main star alleles covered
CYP2C9	Warfarin, NSAIDs, siponimod, phenytoin	Good — *2, *3 variants covered
VKORC1	Warfarin sensitivity	Yes
SLCO1B1	Statins (simvastatin, rosuvastatin) myopathy risk	Yes
MTHFR	Folate methylation (not CPIC actionable for drugs)	Yes — C677T and A1298C

What Raw DNA Analysis Cannot Tell You

Being honest about what consumer arrays cannot do is as important as knowing what they can. Four hard limitations apply to every raw DNA analysis service:

Gene deletions and duplications — CYP2D6 in particular is known for copy-number variants (CYP2D6*5 deletion, CYP2D6*1xN duplications). Consumer SNP arrays cannot detect these. Clinical pharmacogenomic labs use targeted assays that can.
Rare variants — if a variant is carried by fewer than ~1% of people, it likely is not on the array. This matters most for CYP2D6 (dozens of low-frequency but clinically significant alleles) and HLA-B (hundreds of variants relevant to abacavir, carbamazepine, allopurinol).
Non-genetic factors — drug response depends on age, organ function, other medications (phenoconversion), diet, and health conditions. Genetics is one input, not the whole picture.
Diagnostic conclusions — raw DNA analysis is educational and not an FDA-cleared clinical diagnostic. Prescribing decisions should always involve a clinician. For high-stakes scenarios, a clinical PGx test is the right tool.

See our full limitations page for the complete breakdown.

Raw DNA Analysis vs Clinical Pharmacogenomic Testing

Both approaches produce pharmacogenomic insights. The trade-offs are cost, coverage, and turnaround. For most people exploring PGx for the first time or comparing their situation against published CPIC guidelines, raw DNA analysis is enough. For high-stakes clinical decisions, a provider-ordered clinical test is the right call.

Dimension	Raw DNA Reuse (DecodeMyBio)	Clinical PGx (GeneSight, Genomind)
Cost	Free to start; $59 one-time Decode	$330–$2,000 self-pay
New sample needed	No — uses existing data	Yes — cheek swab
Turnaround	Minutes	2–4 weeks
Prescriber required	No	Yes
Gene-deletion / duplication detection	No	Yes
Rare-variant detection	Limited	Yes
Insurance coverage	No	Often (especially Medicare Part B)
Regulatory status	Educational (not FDA-cleared)	CLIA-certified lab / LDT

For the full pricing breakdown, see our pharmacogenomic testing cost guide and our GeneSight cost breakdown.

How DecodeMyBio Analyzes Your Raw Data

Once you upload, we do four things:

Parse the file — we extract every SNP call, map rsIDs to chromosomal positions, and build a variant table specific to your file's chip version.
Call star alleles — for pharmacogenes like CYP2D6 and CYP2C19 we infer the star-allele haplotype (e.g. *1/*2) from the combination of SNPs in your file, using published PGx haplotype definitions.
Assign phenotypes — each diplotype maps to a metabolizer phenotype (poor, intermediate, normal, rapid, ultra-rapid) per CPIC allele function tables.
Generate CPIC-aligned recommendations — each drug-gene pair is scored against the current CPIC guideline. Decode+ shows you the gene, phenotype, drug, and clinical action.
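The star-allele and phenotype steps above can be sketched in a few lines of Python for CYP2C19. This is a simplified illustration, not DecodeMyBio's pipeline: it looks at only three defining SNPs (rs4244285 for *2, rs4986893 for *3, rs12248560 for *17), assumes plus-strand genotype letters, and assumes that variants at different sites sit on opposite chromosome copies because array data is unphased. Treat the SNP table and the phenotype map as assumptions to check against the current allele definitions and CPIC tables.

```python
from typing import Dict, List, Optional, Tuple

# Defining SNP for three common CYP2C19 star alleles: (rsID, variant base).
# Real allele-definition tables cover many more alleles and positions.
CYP2C19_DEFINING_SNPS: Dict[str, Tuple[str, str]] = {
    "*2": ("rs4244285", "A"),
    "*3": ("rs4986893", "A"),
    "*17": ("rs12248560", "T"),
}

# Diplotype -> metabolizer phenotype for the alleles handled in this sketch.
CYP2C19_PHENOTYPES: Dict[Tuple[str, str], str] = {
    ("*1", "*1"): "normal",
    ("*1", "*17"): "rapid",
    ("*17", "*17"): "ultra-rapid",
    ("*1", "*2"): "intermediate",
    ("*1", "*3"): "intermediate",
    ("*2", "*17"): "intermediate",
    ("*3", "*17"): "intermediate",
    ("*2", "*2"): "poor",
    ("*2", "*3"): "poor",
    ("*3", "*3"): "poor",
}


def call_cyp2c19_diplotype(genotypes: Dict[str, str]) -> Optional[Tuple[str, str]]:
    """Infer a diplotype from unphased calls, or None if it cannot be called."""
    found: List[str] = []
    for allele, (rsid, variant_base) in CYP2C19_DEFINING_SNPS.items():
        call = genotypes.get(rsid, "--")
        if len(call) != 2 or "-" in call:
            return None  # marker missing or no-call: refuse to guess
        # Each copy of the variant base counts as one copy of the allele.
        found.extend([allele] * call.count(variant_base))
    if len(found) > 2:
        return None  # cannot be resolved without phasing information
    # Any chromosome copy without a variant hit is treated as reference (*1).
    found.extend(["*1"] * (2 - len(found)))
    first, second = sorted(found, key=lambda a: int(a[1:]))
    return (first, second)


example = {"rs4244285": "AG", "rs4986893": "GG", "rs12248560": "CT"}
diplotype = call_cyp2c19_diplotype(example)
print(diplotype, CYP2C19_PHENOTYPES.get(diplotype))
```

Returning `None` when a marker is missing mirrors the point made elsewhere in this guide: a file with fewer markers supports fewer confident calls.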

Your results are reviewed for internal consistency and laid out for you inside Decode+ — interactive, searchable, and updated as guidelines change. See our full methodology page for the technical detail and our data sources page for the references we use.

What You'll See Inside Decode+

Decode+ is designed to be usable at two levels — quick enough to scan in 5 minutes and detailed enough to dig into gene by gene. Your medication-safety results include:

Risk snapshot — an at-a-glance summary of the most significant findings in your file.
Medication checklist — 48 named medications across common therapeutic classes, each flagged with a clinical action (standard, adjust, avoid).
Phenotype detail — star allele, metabolizer status, and clinical meaning for each analyzed gene.
CPIC-cited evidence — every finding linked to the published guideline it's based on, so you can take it into a conversation with your clinician.

Preview a full sample report; no account or upload is needed.

Raw DNA Analysis Pricing

DecodeMyBio is free to start: create an account, upload your raw DNA file, and see your genome overview at no cost. A one-time Decode unlocks every result, including pharmacogenomics, nutrition, cannabis, pain, celiac, and traits:

$59 one-time, yours for life. Full access to every result. No subscription.


Compare to clinical PGx panels — which cost $330–$2,000 per test — in the GeneSight vs Genomind guide.

Frequently Asked Questions

Is raw DNA analysis accurate?

For the variants it can see, yes — consumer array accuracy is typically 99%+ per SNP. The limitation is coverage, not precision. Consumer arrays test ~600,000–700,000 positions out of 3+ billion in the genome; within that set, calls are reliable.

Can I use raw DNA analysis to skip clinical PGx testing?

For informational and educational purposes, yes. For high-stakes prescribing decisions (starting warfarin, tamoxifen, clopidogrel with kidney disease, complex anesthesia), a provider-ordered clinical test is more appropriate because it covers rare variants and copy-number changes.

What if my raw data file is old?

Your genome does not change. 23andMe v3 (2010), v4 (2013), and v5 (2017) files are all valid. Newer chips just cover more pharmacogenomic markers, so a 2013 file may have fewer markers than a 2020 file — but what is there is still accurate.

What happens to my data after analysis?

Files are encrypted at rest and in transit. Your data stays under your control — delete it anytime from your account, or contact support for an export. We never sell or share it with third parties. See privacy for full detail.

What if I do not have raw DNA data yet?

Take an inexpensive consumer test first — 23andMe and AncestryDNA start around $99. Combined with a one-time $59 Decode, you are still well under clinical PGx pricing.

How long does analysis take?

File parsing and variant identification run in minutes, so your results are ready in Decode+ shortly after you upload. There is no PDF and no waiting on email.

Understanding Star Alleles and Metabolizer Phenotypes

Pharmacogenomics describes your genotype using star alleles — notation like CYP2D6*1/*4 or CYP2C19*2/*17. This shorthand exists because most pharmacogenes have specific SNP combinations (haplotypes) that recur in populations. Each haplotype gets a star-number name (*1 is typically the reference, *2 onwards are the defined variants), and your two copies — one inherited from each parent — form your diplotype.

Each star allele has an activity score defined by CPIC based on in vitro and in vivo data (0 = no function, 0.5 = decreased, 1 = normal, >1 = increased). The sum of your two activity scores maps to a phenotype:

Poor metabolizer (PM) — activity score 0. Little or no enzyme activity. Standard doses of drugs metabolized by this enzyme can reach toxic plasma levels; prodrugs (codeine, tamoxifen) fail to activate.
Intermediate metabolizer (IM) — activity score 0.25–1.0. Reduced but not absent enzyme activity; dose adjustments may be warranted.
Normal metabolizer (NM) — activity score 1.25–2.25. Standard dosing applies.
Rapid metabolizer (RM) — a category used for genes such as CYP2C19 (for example *1/*17), which CPIC classifies by diplotype rather than by activity score. Faster than normal breakdown; some drugs may be under-exposed.
Ultra-rapid metabolizer (UM) — CYP2D6 activity score above 2.25. Typically from CYP2D6 gene duplications. Prodrugs may reach toxic levels of active metabolite (e.g. codeine → morphine). CYP2D6 duplications are not detectable from consumer arrays because they are structural variants.
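As a worked illustration of the activity-score arithmetic, the Python sketch below sums two CYP2D6 allele values and maps the total to a band. The allele values shown (*1 and *2 = 1, *4 = 0, *10 = 0.25, *41 = 0.5) reflect our understanding of the CPIC allele function table and should be verified against the current version; the function name is hypothetical. Totals above the normal band are reported without a phenotype label, because reaching them generally requires copy-number information that a SNP array does not provide.

```python
from typing import Dict, Tuple

# Assumed activity values for a handful of CYP2D6 alleles (verify against CPIC).
CYP2D6_ACTIVITY: Dict[str, float] = {
    "*1": 1.0,
    "*2": 1.0,
    "*4": 0.0,
    "*10": 0.25,
    "*41": 0.5,
}


def cyp2d6_phenotype(allele_a: str, allele_b: str) -> Tuple[float, str]:
    """Sum the two allele activity values and map the total to a band."""
    score = CYP2D6_ACTIVITY[allele_a] + CYP2D6_ACTIVITY[allele_b]
    if score == 0:
        return score, "poor"
    if score < 1.25:
        return score, "intermediate"
    if score <= 2.25:
        return score, "normal"
    # Higher totals need duplication (copy-number) data to interpret.
    return score, "above normal range"


for pair in [("*4", "*4"), ("*1", "*4"), ("*10", "*41"), ("*1", "*41")]:
    print(pair, cyp2d6_phenotype(*pair))
```

For example, *1/*4 sums to 1.0 + 0 = 1.0, which falls in the intermediate band, while *1/*41 sums to 1.5 and falls in the normal band.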

Learn more about what poor metabolizer means and see the CYP2D6 deep-dive for a fully worked example.

The Science: How Consumer SNP Arrays Actually Work

Consumer DNA tests use a technology called SNP microarray genotyping. It is different from whole-genome sequencing in an important way: it only tests a specific, predetermined set of positions in your genome, not the full sequence. Each of those positions is a single-nucleotide polymorphism — a spot where the population has known variability (e.g. some people have an A, others have a G at a given position).

A typical 23andMe v5 chip (Illumina Global Screening Array) tests about 640,000 SNP positions. That sounds like a lot, but the human genome has over 3 billion base pairs. The chip is roughly sampling 0.02% of your genome — but it is sampling the positions researchers have identified as the most informative for ancestry, disease risk, and pharmacogenomics.
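The 0.02% figure follows directly from the two numbers in the paragraph above. A quick check in Python, using the rounded values of 640,000 SNPs and 3 billion base pairs:

```python
snps_on_chip = 640_000
genome_base_pairs = 3_000_000_000

# Share of genome positions the chip reads, expressed as a percentage.
fraction_percent = snps_on_chip / genome_base_pairs * 100
print(f"{fraction_percent:.3f}%")  # 0.021%
```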

For each SNP, the chip produces a genotype call in one of three forms: homozygous reference (e.g. AA), heterozygous (AG), or homozygous alternate (GG). Call accuracy is typically 99%+ per SNP for common variants. The limitations are:

Array content is fixed — if a SNP is not on the chip, it is not in your file. Chip versions vary across years: 23andMe v3 (2010), v4 (2013), v5 (2017) progressively expanded pharmacogenomic coverage.
Structural variants are invisible — deletions, duplications, and large rearrangements (critical for CYP2D6) do not show up. You see individual SNP calls but not the architecture of the gene.
Rare variants are underrepresented — array content skews toward population-common variants. Very rare but clinically important alleles (especially relevant for HLA typing, CYP2D6 sub-alleles) often are not covered.
Ethnic representation varies — chips were historically designed with European-ancestry populations. Variants more common in African, East Asian, or Indigenous populations are improving with newer chips but coverage is still uneven.
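The three genotype call forms described above, plus the no-call case that every raw file contains, can be told apart with a small helper. This is a sketch with a hypothetical function name; it assumes two-letter calls on the same strand as the reference and alternate bases you pass in, and it treats anything other than A, C, G, or T (such as the `--` placeholder 23andMe uses) as a no-call.

```python
def classify_call(genotype: str, ref: str, alt: str) -> str:
    """Classify a two-letter genotype call against reference and alternate bases."""
    if len(genotype) != 2 or any(base not in "ACGT" for base in genotype):
        return "no-call"
    alleles = set(genotype)
    if alleles == {ref}:
        return "homozygous reference"
    if alleles == {alt}:
        return "homozygous alternate"
    if alleles == {ref, alt}:
        return "heterozygous"
    # A base that is neither ref nor alt, e.g. a strand mismatch or a third allele.
    return "unexpected allele"


for call in ["AA", "AG", "GA", "GG", "--"]:
    print(call, "->", classify_call(call, ref="A", alt="G"))
```

Using a set makes the check order-independent, so "AG" and "GA" are both reported as heterozygous.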

For most pharmacogenomic applications this is acceptable. The major CPIC-actionable phenotypes in CYP2C19, CYP2C9, VKORC1, SLCO1B1, and CYP2D6 (excluding copy-number) are driven by SNPs that are covered.

Phenoconversion: When Drug Interactions Override Your Genetics

Your genetic metabolizer phenotype is what your genes predict — but what happens in your body is modified by everything else you are taking. Phenoconversion is the phenomenon where a drug interaction changes your effective phenotype, often dramatically.

A classic example: fluoxetine, paroxetine, and bupropion are strong CYP2D6 inhibitors. A person with a genetic CYP2D6 normal-metabolizer genotype, if taking one of these, behaves functionally as a CYP2D6 poor metabolizer. This matters for any drug metabolized by CYP2D6 — codeine will not convert efficiently to morphine, tamoxifen will not form endoxifen, tricyclics will accumulate.

Raw DNA analysis reports your genetic phenotype. It does not know what other medications you are on. If you are taking a strong inhibitor or inducer of a pharmacogene, your effective phenotype may be shifted one or two categories — a shift a clinician should factor into prescribing decisions. See our limitations page for the list of common CYP inhibitors and inducers to watch for.
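The CYP2D6 example above can be expressed as a simple rule. The Python sketch below is illustrative only: the inhibitor list contains just the three drugs named in this section, the function name is hypothetical, and collapsing every genetic phenotype to "poor" in the presence of a strong inhibitor is a simplifying assumption. Real phenoconversion depends on dose, timing, and inhibitor strength, and it is a clinician's call.

```python
from typing import Iterable

# Strong CYP2D6 inhibitors named in the text; a clinical list would be longer.
STRONG_CYP2D6_INHIBITORS = {"fluoxetine", "paroxetine", "bupropion"}


def effective_cyp2d6_phenotype(genetic_phenotype: str,
                               current_medications: Iterable[str]) -> str:
    """Return the phenotype after accounting for strong CYP2D6 inhibitors."""
    meds = {name.strip().lower() for name in current_medications}
    if meds & STRONG_CYP2D6_INHIBITORS:
        # Simplification: a strong inhibitor makes the person behave as a PM.
        return "poor"
    return genetic_phenotype


print(effective_cyp2d6_phenotype("normal", ["Paroxetine", "atorvastatin"]))  # poor
print(effective_cyp2d6_phenotype("normal", ["atorvastatin"]))  # normal
```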

Common Misconceptions About Raw DNA Analysis

“Consumer DNA tests are not accurate enough for medical use.”

Per-SNP accuracy on a modern 23andMe or AncestryDNA chip is typically 99%+ for common variants. The limitation is coverage (what the chip can see), not accuracy (whether calls are correct). For the variants that drive most CPIC-actionable phenotypes, consumer arrays are reliable.

“I already have 23andMe health reports — I don't need anything else.”

23andMe reports on a curated subset of variants in a limited set of conditions. The pharmacogenomic variants in your file — CYP2D6, CYP2C19, CYP2C9, VKORC1, SLCO1B1 — are largely not part of the 23andMe health section. Raw data analysis reveals what your testing company chose not to report on.

“If I have an MTHFR variant, I need special supplements.”

The supplement industry heavily markets to people with MTHFR variants, but the CDC, ACOG, and AAFP all recommend the same 400 mcg folic acid dose regardless of MTHFR status. Evidence for methylfolate superiority in the general population is weak.

“A normal metabolizer result means I can ignore PGx.”

“Normal” at one gene does not mean normal at all relevant genes. A CYP2D6 normal metabolizer who is also a CYP2C19 poor metabolizer still has significant drug response implications for omeprazole, clopidogrel, and escitalopram. Pharmacogenomics is multi-gene.

Consumer Arrays vs Whole Genome Sequencing (WGS)

If you are deciding whether to take a consumer DNA test (array-based) or invest in whole genome sequencing, here is the practical comparison for pharmacogenomic purposes:

Consumer arrays (23andMe, AncestryDNA) cost $79–$199 and test ~640,000–900,000 specific SNPs. For pharmacogenomics, they cover the common variants that drive 90–95% of CPIC-actionable phenotypes in major European-ancestry populations. Call accuracy is excellent. Limitation: no copy-number, limited rare variants.

Whole genome sequencing (Nebula, Dante Labs) costs $200–$500+ for 30× coverage and captures essentially every base in your genome. For pharmacogenomics it covers all known variants including rare ones. It also produces a VCF file that raw-data analysis services can process. Limitation: cost, data management complexity, and analytical overkill for people who only want PGx insights.

Practical take: for pharmacogenomic insights specifically, a consumer array plus raw-data reuse is the cost-effective path. WGS is worth it if you want comprehensive health-risk analysis or plan to do ongoing research with your genome. DecodeMyBio accepts VCF files from WGS services, so you are not locked out either way.

Who Benefits Most (and Least) from Raw DNA Analysis

Raw DNA analysis is an informational tool, not a one-size-fits-all recommendation. Here is who tends to get the most value:

High value:

People who have tried multiple antidepressants without success — pharmacogenomic data often explains why.
People who experience unusually strong side effects at normal doses of common drugs (codeine, tramadol, certain statins, PPIs).
People starting a medication with a narrow therapeutic index (warfarin, phenytoin, tacrolimus) who want to discuss baseline genetic risk with their prescriber.
People on multiple medications who want a written record to review for potential drug-gene interactions.
People who already have a 23andMe or AncestryDNA file and want to get more value from it.

Lower value:

People facing an immediate high-stakes prescribing decision (e.g. complex oncology dosing) — a provider-ordered clinical PGx test that also detects copy-number and rare variants is more appropriate.
People looking for a broad disease-risk screen — raw DNA analysis focuses on drug-gene interactions, not polygenic disease risk.
People without existing consumer DNA data who are not willing to take a consumer test first — starting with a provider-ordered clinical PGx test through insurance may be cheaper.

Our methodology page details exactly which genes and variants we analyze, and our limitations page is honest about what we cannot tell you.

Ready to Decode Your DNA?

Create your account and upload your 23andMe, AncestryDNA, MyHeritage, or FamilyTreeDNA file in about two minutes. See your genome overview free, then unlock every result with Decode+.
