The IMAGEN database contains data collected and processed by the IMAGEN consortium from over 2000 adolescents and their parents. It includes demographics, neuropsychological assessments, medical questionnaires, MR neuroimaging and genomics. Data have been collected over a period of 10 years at 8 recruitment centres and at 4 successive time points: baseline at age 14 (BL), follow-up 1 at age 16 (FU1), follow-up 2 at age 19 (FU2) and follow-up 3 at age 23 (FU3).

Access data and software

To access the IMAGEN database, please refer to the data access policy. After being granted access, browse data starting with the database web portal using any recent web browser or through SFTP under s

We have written dedicated software and scripts for data collection, quality control, processing and publishing. Most of our recent software is published and maintained on GitHub.

Data release history

We have released successive versions of the Imagen dataset:

Dataset version    Release date
2.6                2016-10-17
2.5                2016-04-08
2.4                2015-11-19
2.3                2015-07-02
2.2                2014-12-11

The seminal European project published data from baseline (BL) and the first follow-up (FU1) on a customized XNAT server. We do not have detailed versioning information for these data releases, and part of the software used to manage the dataset is no longer available, being partially proprietary or too complex to maintain and publish.

For follow-ups 2 and 3 (FU2 and FU3), we moved as much as possible to open source software and scripts. Data have been released on a new server in well-defined versions, and we have kept a log of these successive versions (numbered 2.*).


Detailed documentation is available both in the database itself and from this specific page.

Basic demographic variables

Basic demographic variables are scattered across questionnaires and other tabular data.

Since this is a longitudinal study, age is a property of each assessment rather than of the subject. Age in days at assessment is available in most questionnaires.
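Because age is recorded in days and per assessment, analyses usually need a conversion to years keyed by both participant and time point. The following is a minimal sketch only: the record layout, the placeholder identifiers and the age values are made up for illustration, and dividing by 365.25 is an approximation rather than a documented IMAGEN convention.

```python
# Approximate conversion of "age in days at assessment" to years.
# ASSUMPTION: 365.25 days per year (mean Julian year), not an IMAGEN rule.
DAYS_PER_YEAR = 365.25


def age_in_years(age_days: float) -> float:
    """Convert an age in days to an approximate age in years."""
    return age_days / DAYS_PER_YEAR


# Hypothetical records: (participant identifier, time point, age in days).
# Identifiers and values are placeholders, not real data.
assessments = [
    ("000000000001", "BL", 5230),
    ("000000000001", "FU2", 7012),
]

# Age is keyed by (participant, time point), never by participant alone.
ages = {
    (pid, time_point): round(age_in_years(days), 2)
    for pid, time_point, days in assessments
}
```

Keying on the (participant, time point) pair keeps one age per assessment, so the same participant can appear at BL, FU1, FU2 and FU3 without values overwriting each other.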

The sex of subjects is also scattered across a handful of variables in the tabular data. In some cases these variables are inconsistent. We have investigated these cases with invaluable help from the recruitment centres and created a reference table (temporarily available from s

Note that participant 000015439849 moved between BL and FU2, so data from FU2 onward were acquired at a different centre from the initial inclusion centre.

Finally we provide a list of valid participant identifiers to help end-users detect and investigate possible identifier errors.
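A common source of identifier errors is a tool that reads identifiers as numbers and drops the leading zeros. The sketch below normalizes identifiers and checks them against a set of valid ones. It assumes identifiers are 12-digit, zero-padded strings, as in the examples on this page; the function names and the tiny `valid_ids` set are hypothetical stand-ins for the real list.

```python
ID_LENGTH = 12  # ASSUMPTION: identifiers are 12 digits, as shown on this page


def normalize_id(raw) -> str:
    """Return the identifier as a zero-padded string of ID_LENGTH digits."""
    text = str(raw).strip()
    if not text.isdigit() or len(text) > ID_LENGTH:
        raise ValueError(f"not a plausible participant identifier: {raw!r}")
    return text.zfill(ID_LENGTH)


def split_valid(ids, valid_ids):
    """Split identifiers into (known, unknown) after normalization."""
    known, unknown = [], []
    for raw in ids:
        pid = normalize_id(raw)
        (known if pid in valid_ids else unknown).append(pid)
    return known, unknown


# Hypothetical stand-in for the published list of valid identifiers.
valid_ids = {"000015439849"}

# 15439849 is the same identifier after a spreadsheet stripped its zeros;
# "000000000000" is a made-up identifier that should be flagged.
known, unknown = split_valid([15439849, "000000000000"], valid_ids)
```

Identifiers that end up in `unknown` are candidates for manual investigation rather than automatic deletion.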


While the initial intent was to avoid siblings in the Imagen dataset, please take into account the following list of siblings who have made it into the dataset:

  • 000001283761, 000006000160
  • 000001939282, 000029580680
  • 000005683533, 000074247248
  • 000012988699, 000054713646
  • 000018931943, 000067854391
  • 000019767938, 000055465932
  • 000021729241, 000096466079
  • 000024686194, 000054678674
  • 000026629318, 000093535053
  • 000032899225, 000075760021
  • 000033116784, 000084828767
  • 000037821661, 000065084605
  • 000042288666, 000072198640
  • 000051387606, 000081351807
  • 000052713040, 000062252904
  • 000056673014, 000087495487
  • 000056896962, 000099550415
  • 000059625999, 000067955800
  • 000068049177, 000084838602
  • 000070675464, 000083037309
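
Analyses that assume independent participants may need to account for these pairs. The sketch below shows two possible approaches, neither of which is prescribed by the consortium: keeping only one sibling per pair, or assigning a shared family label that can serve as a grouping factor. Only the first two pairs from the list are included here, the function names are hypothetical, and the choice of which sibling to keep is arbitrary.

```python
# First two pairs from the list above; extend with the remaining pairs.
SIBLING_PAIRS = [
    ("000001283761", "000006000160"),
    ("000001939282", "000029580680"),
]


def drop_one_sibling(participants, pairs=SIBLING_PAIRS):
    """Drop the second sibling of a pair when both siblings are present."""
    present = set(participants)
    excluded = {b for a, b in pairs if a in present and b in present}
    return [p for p in participants if p not in excluded]


def family_labels(participants, pairs=SIBLING_PAIRS):
    """Map each participant to a family label.

    Siblings share the identifier of the first sibling in their pair;
    everyone else is labelled with their own identifier.
    """
    first_sibling = {b: a for a, b in pairs}
    return {p: first_sibling.get(p, p) for p in participants}


sample = ["000001283761", "000006000160", "000029580680"]
independent = drop_one_sibling(sample)
families = family_labels(sample)
```

In `sample`, the first pair is complete, so its second sibling is dropped, while the lone member of the second pair is kept because their sibling is absent from the sample.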