Memorandum by Oxagen Ltd
Oxagen is a Clinical Genomics Company investigating
the genetic basis of common human disease to discover new diagnostics
and better targeted therapies. Founded in 1997 as a spin-out from
the University of Oxford, the Company now has 75 employees, and
is based on Milton Park near Abingdon, Oxon (see Attachment 1
for more details).
What current projects involve collecting genetic
information on people in the UK?
We are running nine research programmes searching
for genes in which variants can lead to increased susceptibility
to human disease. These are all common diseases, which have a
complex underlying aetiology involving multiple genetic determinants,
a large environmental component and chance. The diseases we are
studying are: heart disease, asthma, inflammatory bowel disease
(Crohn's disease and ulcerative colitis), osteoporosis, endometriosis,
type two diabetes, polycystic ovary syndrome, autoimmune thyroid
disease and psoriasis.
The projects rely on assembling large collections
of families, typically with two or more affected siblings. These
are identified by our clinical collaborators through clinics or
retrospectively by examination of their own databases. Blood is
collected (typically three x 10 ml), and we prepare DNA for genetic
analysis. The DNA is genotyped for 400 genetic markers (chromosomal
"signposts"), and the data are analysed to reveal trends
of co-inheritance between particular markers and disease. This
narrows the search for the genes involved to a small number of
chromosomal regions, typically around 30 million base pairs in
length (about 1 per cent of the genome). These regions are then
refined by typing the same and additional samples using more densely
spaced markers to identify smaller regions of "association".
The goal here is to identify DNA sequences that are more prevalent
in affected individuals than in the general population. Detecting
such sequences narrows the location of the disease gene to less
than 300,000 base pairs. The final step is to identify all the
genes and common variations in DNA sequence in the region, and
to identify those sequence variations that affect gene function.
It is only after this final stage that one can claim to have identified
a "disease gene".
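The scale of this narrowing can be checked with simple arithmetic. The sketch below is illustrative only: it assumes a genome of roughly 3,000 million base pairs (a figure not stated in this memorandum, but consistent with 30 million base pairs being about 1 per cent), and the names used are hypothetical.

```python
# Illustrative arithmetic only. GENOME_BP is an assumption (about 3,000 million
# base pairs); the two region sizes are the figures quoted in the text.
GENOME_BP = 3_000_000_000
LINKAGE_REGION_BP = 30_000_000      # region found by family co-inheritance analysis
ASSOCIATION_REGION_BP = 300_000     # region remaining after association mapping


def fraction_of_genome(region_bp: int, genome_bp: int = GENOME_BP) -> float:
    """Return the share of the genome covered by a region, as a fraction."""
    return region_bp / genome_bp


linkage_share = fraction_of_genome(LINKAGE_REGION_BP)
association_share = fraction_of_genome(ASSOCIATION_REGION_BP)
fold_reduction = LINKAGE_REGION_BP / ASSOCIATION_REGION_BP

print(f"Linkage region: {linkage_share:.2%} of the genome")
print(f"Association region: {association_share:.4%} of the genome")
print(f"Narrowing between the two stages: {fold_reduction:.0f}-fold")
```

On these assumptions the association stage reduces the search area by a factor of one hundred, which is why the final gene-by-gene analysis only becomes practical at that point.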
All of this information (on pedigrees,
clinical status and genotype) is kept on secure databases
within the Company. Each sample is identified only through a reference
number and the Company has no way of matching individual test
results with the identity or address of a participant. Also, Oxagen's
collaborators only provide clinical data relevant to the disease
under investigation; the Company does not receive a copy
of the patient's medical notes and can only gain access to them
under medical supervision for verification purposes.
What other projects are about to start?
We are planning to extend our autoimmune and
inflammation programmes to examine arthritis. We are also considering
participation in a European initiative to study the genetics of
lung cancer susceptibility.
We are interested in the idea of collecting
a very large (around 1 million) prospective population sample
that would be used to validate the diagnostic potential of genetic
tests for disease susceptibility, and to examine gene-gene and
gene-environment interactions. The health of individuals would
be followed over a long period of time, allowing medical events
to be correlated with particular genotypes. This would be a huge
project, necessarily involving a number of interested parties,
including the NHS and the major sources of medical research funding
such as the MRC and the Wellcome Trust. The idea of such a study
has already been floated publicly by others.
Are there collections of material (eg tissue samples)
that could be used to generate databases of DNA profiles?
This is not an area we are involved in. Our
feeling is that existing tissue banks are not particularly useful
in the study of disease genetics because the DNA would be of erratic
quality and quantity, and there might be insufficient data on
the source of the sample. An important exception to this is in
the field of cancer research, where archived tissue samples are
very suitable for the study of the somatic mutation events that
occur during tumorigenesis.
Why are these genetic databases being assembled?
All commentators agree that the completion of
the human genome sequence is just the start of a massive enterprise
to catalogue and ascribe functions to all the genes. Beyond this,
the challenge is to understand how small sequence differences
in the genes (known as single nucleotide polymorphisms or SNPs)
influence gene function and hence disease susceptibility. We are
well placed in the UK to capture much of the commercial opportunity
that will flow from this next phase, in particular, the study
of genetic diversity. Some of this will be carried out in computers,
but hypotheses generated in silico will need to be verified
in vitro and in vivo using increasingly sophisticated
model systems: the whole panoply of functional genomics.
Our belief is that gene function will eventually have to be studied
in man, and that the study of how genetic variation influences
disease susceptibility and progression is the ultimate in functional
genomics.
Genetic databases are essential if we are to
piece together the complex jigsaw of common disease susceptibility.
Our goal is to gain new insights into the molecular mechanisms
underlying disease on which radical new treatments can be based.
Other benefits will flow from our ability to classify disease
more precisely (disease stratification) and to identify
those most at risk from disease, allowing us to target life-style
advice, screening and preventative intervention.
How are these activities funded?
We largely fund our own programmes, though a
number receive an element of governmental support through grants
such as the LINK scheme. A number of projects start as shared-risk
initiatives with academics, in which we contribute genotyping
resources in return for an option to take the study forward should
the results look encouraging.
What practical considerations will constrain developments?
Our ability to assemble the large clinical cohorts,
including affected and unaffected individuals, is essential if
we are to identify the genes underlying common disease. This is
already a costly and time-consuming activity, and the impact of
any additional safeguards on the use of patient data would have
to be carefully assessed to ensure that it did not render large-scale
genetic and epidemiological studies impractical. The UK has the
opportunity to be a world-leading centre for such studies. Of
particular concern to us would be restrictions on the use by clinicians
of hospital records and databases to identify potential study
participants, or a demand for individuals to have access
to the test results.
Are there alternative ways of fulfilling the objectives?
Genetics is probably the most powerful tool
for studying the operation of complex biological networks. This
was shown originally for bacteria and viruses, but can now be
applied to more complex organisms ranging from yeast through the
nematode and fruit fly to vertebrates such as mice. With man,
no one would want to carry out germ-line modification to test
genetic theories. However, we can still study genetics in man
by a study of existing variation. This ranges from the study of
rare single gene disorders, which are essentially a human
equivalent of knock-out mice (transgenic mice engineered to lack
a particular gene), to the common diseases that are the major
causes of human morbidity and that touch all our lives.
Any study of human genetics inevitably calls
for a correlation of genotype with phenotype (ie the effect on
the individual) on a large scale, and there is therefore no realistic
alternative to the use of genetic databases in some form.
What is the genetic information that is being
collected?
This ranges from simple statements of a family
history of disease, through large assemblies of anonymous genetic
marker information (DNA signposts with no impact on disease such
as microsatellite sequences or single nucleotide sequences in
intergenic regions) to specific genotypes which are known to affect
gene function or influence disease susceptibility. Of course what
is thought of as an anonymous marker might at some time in the
future turn out to be an important determinant of disease. However,
the overwhelming majority of genetic variation examined will be
essentially neutral in effect, and will turn out to have no bearing
on disease. One important ethical issue is what to do with information,
initially thought to be of no significance, that subsequently
turns out to have major implications for an individual or family.
We view this as a judgement that we should not try to make; rather,
we would pass the data to the relevant supervisory clinician and
let them make the judgement.
Oxagen has developed and published its policies
in respect of ethics and sample collection procedures (see Attachments
2 and 3). [Not printed]
How is it being stored and protected?
In our Company, the genotype and phenotype data
are collected into centralised databases on a secure server with
access limited to certain Company employees. Collaborating clinicians
and commercial partners are not given access to company systems
directly or indirectly (eg via a VPN). The Company operates to
a high standard of IT security to prevent illicit access to data
from internal or external sources, and its systems are isolated from
the Internet by a firewall device. The Company also prohibits the use of laptop
computers for analysis and management of clinical datasets to
reduce the risk of exposure due to theft. However, the ultimate
guarantee against misuse of such information is that we do not
have access to any patient names or addresses: individuals
are tracked with internally generated unique identifiers. We do
not operate irreversible anonymisation of the data, however, as
we need to provide for re-bleeds and clarification of inconsistencies
in phenotype data. This is allowed for by sending the relevant
unique identifier to the collaborating clinician.
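The arrangement described above, in which the Company holds only identifiers and the clinician holds the link back to the patient, can be sketched as follows. This is a minimal illustration under stated assumptions, not a description of Oxagen's actual systems: the class names, method names and the random identifier format are all hypothetical, and the memorandum does not specify how identifiers are issued or exchanged.

```python
import secrets


class ClinicianRegistry:
    """Held by the collaborating clinician: links identifiers to patients."""

    def __init__(self) -> None:
        self._identities: dict[str, str] = {}

    def register(self, sample_id: str, patient_name: str) -> None:
        self._identities[sample_id] = patient_name

    def resolve(self, sample_id: str) -> str:
        """Look up the patient behind an identifier (clinician side only)."""
        return self._identities[sample_id]


class CompanyDatabase:
    """Held by the company: genotype and phenotype data keyed by identifier."""

    def __init__(self) -> None:
        self.records: dict[str, dict[str, str]] = {}

    def new_sample_id(self) -> str:
        """Generate a random identifier that carries no personal information."""
        sample_id = secrets.token_hex(8)
        self.records[sample_id] = {}
        return sample_id

    def record(self, sample_id: str, field: str, value: str) -> None:
        self.records[sample_id][field] = value

    def rebleed_request(self, sample_id: str) -> str:
        """Return the identifier to send to the clinician; no name is involved."""
        if sample_id not in self.records:
            raise KeyError(f"unknown sample identifier: {sample_id}")
        return sample_id


# Hypothetical walk-through: the patient's name never enters the company database.
company = CompanyDatabase()
clinician = ClinicianRegistry()

sample_id = company.new_sample_id()
clinician.register(sample_id, "Jane Example")
company.record(sample_id, "phenotype", "asthma")

request = company.rebleed_request(sample_id)
patient = clinician.resolve(request)  # only the clinician can take this step
```

The design choice illustrated is that re-identification requires the clinician's registry, so the data are reversible in principle but not by the Company acting alone.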
How do the organisations involved see their responsibilities
regarding privacy, consent, future use, public accountability
and intellectual property rights?
We take our responsibilities with regard to
privacy extremely seriously. All our studies involve fully informed
consent, with the use of the samples restricted to the field defined
in the consent form (see Attachment 4 for an example from our
thyroid disease programme). The clinical materials and data are
collected through collaborating clinicians who act as guarantors
of the patients' interests. An additional important principle
is that we do not seek to own the patient samples; rather, we leave
ownership in the hands of the institution where they were collected.
Our agreements provide for an exclusive period of commercial access
to the samples which is typically three to five years. The samples
therefore remain in the public domain, with access to other academic
researchers controlled by a Research Steering Committee.
We do seek to retain the intellectual property
rights in discoveries made through the use of the samples, though
our agreements provide for a percentage of any returns, including
milestone payments and royalties, payable to the collaborating
institutions. None of our agreements provide for payments to individual
volunteers (other than expenses).
How do they see their activities in the area of
genetic databases developing in the future?
The scale of our genetic databases will grow
exponentially, with the possibility emerging of selling access
to the database to allow genotype/phenotype correlation to be
carried out in the computer. This would be restricted to a subset
of individuals who had given suitable consent.
What advances in sequencing, screening and database
technology are they anticipating?
Our ability to identify genetic variation is
dramatically outpacing our ability to assign function to genes
and SNPs. We are already in a position where there are in excess
of two million common known polymorphisms due to the efforts of
the SNP Consortium and Celera Corporation. The next wave of technological
developments will allow us to genotype many thousands if not all
of these, potentially across thousands of selected individuals.
This will generate vast quantities of data and present huge problems
in terms of analysis. The main impact of the technology will come
from selecting a much smaller subset of functional and
disease-associated SNPs, and typing these in a more targeted fashion.
Genotyping individuals for a set of these important SNPs will
allow us to assign them to groups with elevated risk of certain
diseases, allowing preventative measures to be taken and/or frequent
screening to be used to capture disease in its early, more treatable
stages.
What lessons should be learnt from genetic database
initiatives in other countries?
The initiative taken by Decode in Iceland leads
us to the conclusion that whilst the large-scale database approach
is feasible, the sensitivity surrounding a perceived monopoly
of access means that it would have to be implemented as a public/private
partnership in the UK.
All of our proposals for studies are subject
to rigorous examination by the relevant local and multi-centre
committees. We would welcome an initiative to provide a set of
guidelines to ensure consistency of approach between committees
at different centres.
The wider use and standardisation of databases
in all areas of the NHS would greatly facilitate genetic research.
Ultimately we see great benefit accruing from the computerisation
of all medical records. Access to such records by commercial concerns
should be possible under suitable safeguards such as restrictions
on use and anonymisation of data.
The attitude of the insurance industry is critical
in the development of genetic research. Creative incentives to
the positive use of genetic and other risk factor data need to
be found. The spectre of genetic discrimination needs to be avoided
at all costs, as it could undermine the basis of shared endeavour
between the research community, patients and their families.
It is important to realise that a fundamental
distinction can be drawn between genetic mutations that are deterministic
in terms of disease outcome (single gene disorders such as thalassaemia,
muscular dystrophy and cystic fibrosis), and the polymorphisms
that influence susceptibility to common ailments such as asthma
and heart disease. Whilst the former allow us to make accurate
predictions of individual risk, the common disease variations
do not. Thus an individual carrying a risk allele for Alzheimer's
disease might belong to a group with an increased risk of developing
the disease, but themselves be at low risk because of other, as
yet unidentified, genetic factors. There is therefore limited utility
in genetic databases other than in a research setting.
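The distinction between group risk and individual prediction can be made concrete with a small worked example. Every figure below is hypothetical, chosen only to illustrate the argument; none is taken from this memorandum or from any study.

```python
# All figures are hypothetical and serve only to illustrate the argument.
risk_in_carriers = 0.10      # assumed lifetime risk among carriers of a risk allele
risk_in_non_carriers = 0.05  # assumed lifetime risk among non-carriers

# Group-level statement: how much more common the disease is among carriers.
relative_risk = risk_in_carriers / risk_in_non_carriers

# Individual-level statement: the proportion of carriers who never develop it.
carriers_unaffected = 1 - risk_in_carriers

print(f"Relative risk for carriers: {relative_risk:.1f}")
print(f"Carriers who remain unaffected: {carriers_unaffected:.0%}")
```

On these assumed figures the allele doubles the risk for carriers as a group, yet nine carriers in ten never develop the disease, so a positive test says little about any one individual.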
Attachments: [Not printed]
1. Questions and Answers on genomics and
2. Oxagen's policy on sample collection.
3. Oxagen's policy on the ethics of genetic testing.
4. Sample informed Consent form.
Dr Mark Edwards
4 October 2000