" ?>

Appendix 3: Genealogy Pro Data Storage Format

This appendix is designed for users with a programming background.

This chapter assumes that you have a basic understanding of the Unix Shell and a moderate understanding of Structured Query Language (SQL).

This section makes use of the Sqlite3 executable that is packaged with MacOS X starting with MacOS X system 10.4. If you are running MacOS X with a system prior to system 10.4 it is available as a free download.


File Wrapper Structure

Genealogy Pro documents are File Wrappers - that is, they are directories that are flagged by MacOS X to be presented to the user as a file. The listing in the image below shows the contents of the directory.

All data is stored in the .Contents/Resources/ directory and comprises:

  • 'charts' directory - this is where charts are stored.
  • 'document preferences' - this is a property list file of preferences that apply to the document
  • 'genealogy.database' - this is an SQLite 3 database file that contains all genealogy data




Database Tables

The image below shows the list of tables in the database when querying the database with the sqlite3 executable.




The persons and families Table

The tables that store family information are the 'persons' and the 'families' table.

The image below shows a listing of the persons table. The columns in the table are:

  • persons - an unique integer that identifies this person. (This value is used as the GEDCOM id when exporting to GEDCOM, except it is prefixed with an 'I').

    note: This is a convention in the genealogy.database tables. The first column is the same name as the database and is the primary key identifier for a row in the table.

  • par_family - this identifies the family in the 'families' tables
  • sp_families - this is a bracketed list of identifiers of family in the 'families' table.
  • name - integer that identifies the name in the 'names' table
  • gender - text that identifies the gender
  • birthdate - integer that identifies the birth date in the 'dates' table
  • birthplace - integer that identifies the birth place in the 'places' table
  • deathdate - integer that identifies the death date in the 'dates' table
  • deathplace - integer that identifies the death place in the 'places' table
  • personimage - integer that identifies an image in the 'images' table
  • personImageArray - this is a bracketed list of identifiers of images in the 'images' table
  • OCCU - text that identifies the occupation
  • sources - integer that identifies the source of the persons data in the 'sources' table
  • note - integer that identifies the notes for the person in the 'notes' table

When fields are defined using the 'Database⇒Define New Field ...' function, and the Field Association option chosen is "Individual Field", the new field (except for combined fields) will appear as a column in the 'persons' table. Where the field type is :

  • Name field, it will be an integer that identifies a name in the 'names' table
  • Date field, it will be an integer that identifies a name in the 'dates' table
  • Place field, it will be an integer that identifies a name in the 'places' table,
  • Image field, it will be an integer that identifies a image in the 'places' table,
  • Photo Album field, it will be a bracketed list of integers that identifies a images in the 'images' table and
  • Text field, it will be simple textual information stored in the 'persons' table



The image below shows a listing of the families table. The columns in the table are:

  • families - an unique integer that identifies this family. (This value is used as the GEDCOM id of the family when exporting to GEDCOM, except it is prefixed with an 'F').
  • parent1 - integer that identifies the first of the two parents in the 'persons' table
  • parent2 - integer that identifies the second of the two parents in the 'persons' table<
  • offspring - this is a bracketed list of integers that identify the offsprint in the 'persons' table
  • marrdate - integer that identifies the marriage date in the 'dates' table
  • marrplace - integer that identifies the marriage place in the 'places' table
  • familyImageArray - this is a bracketed list of identifiers of images in the 'images' table
  • sources - integer that identifies the source of the family data in the 'sources' table
  • note - integer that identifies the notes for the family in the 'notes' table

As for 'persons', when fields are defined using the 'Database⇒Define New Field ...' function, and the Field Association option chosen is "Family Field", the new field (except for combined fields) will appear as a column in the 'families' table. Where the field type is :

  • Name field, it will be an integer that identifies a name in the 'names' table
  • Date field, it will be an integer that identifies a name in the 'dates' table
  • Place field, it will be an integer that identifies a name in the 'places' table,
  • Image field, it will be an integer that identifies a image in the 'places' table,
  • Photo Album field, it will be a bracketed list of integers that identifies a images in the 'images' table and
  • Text field, it will be simple textual information stored in the 'families' table




The names, firstnames and surnames Table

The names table has the following columns:

  • names - a unique integer that identifies this name.
  • title - an integer that identifies a title in the 'titles' table
  • firstName - an integer that identifies a title in the 'firstnames' table
  • firstMiddleName - an integer that identifies a name in the 'firstnames' table
  • otherMiddleNames - a bracketed list of integers that identifies names in the 'firstnames' table
  • surname - an integer that identifies a surnames in the 'surnames' table



The surnames table has the following columns:

  • surnames - a unique integer that identifies this surnames
  • names - text of the surname
  • count - number of times that this row is referenced by other tables

Genalogy Pro keeps a running tab of how many times the row is referenced by other tables in the 'count' column. For example, if the user change the surname 'Jackson' to 'Smith', Genealogy Pro will decrement the count to zero. When the count reaches zero the row is flagged for deletion from the table.



The name of a person can be accessed by using the SQL join statement, such as the following:

select p.persons as "Genealogy Pro ID" , s.names as "Last Name" , f.names as "First Name" from persons p left outer join names n on p.name = n.names left outer join surnames s on n.surname = s.surnames left outer join firstnames f on n.firstName = f.firstnames order by s.names, f.names ;

The output from this statement is depicted below.




The dates Table

The dates table has the following columns:

  • dates - a unique integer that identifies the date
  • day - integer representing the day of month. A value of '-1' means no day is known.
  • month - integer representing the month of year. A value of '-1' means no month is known.
  • year - number of times that this row is referenced by other tables. A value of '-1' means no year is known.
  • modifier type - (optional) identifies the type of date. This description is similar to the GEDCOM modifier. eg BEF = before


The date field of a person can be accessed by using the SQL join statement, such as the following:

select p.persons as "id" , s.names as "last name" , f.names as "first name", d.day as "birth day", d.month as "birth month", d.year as "birth year" from persons p left outer join names n on p.name = n.names left outer join surnames s on n.surname = s.surnames left outer join firstnames f on n.firstName = f.firstnames left outer join dates d on p.birthdate = d.dates order by d.year asc, d.month asc, d.day asc;

The output from this statement is depicted below (now using a genealogy pro file, generated from importing the example GEDCOM file).




The table_descriptions Table

This table is used by Genealogy Pro to gather information about the tables. It stores information about the order of the fields (for displaying in the Individuals' browser or Family browser) and the display name of the fields (as displayed in the browsers).