
The GEDCOM File Format Explained

Inside a .ged file: how individuals (INDI), families (FAM), events and tags fit together, with a real annotated example. A plain-English tour of the genealogy file format.

A GEDCOM file can look intimidating the first time you open it in a text editor: a wall of numbers and short capitalised words. It is far simpler than it appears. Once you see the pattern behind those lines, the whole format opens up, and you will be able to read your own family tree in its raw form. Here is the plain-English tour.

Every line has the same shape

The trick to GEDCOM is that every line follows one rule. It starts with a level number, then a tag, then sometimes a value. A line that opens a record also carries an identifier in at-signs, such as @I1@, between the level number and the tag. The level number shows how the line nests under the one above it, the way bullet points indent under a heading.

0 @I1@ INDI
1 NAME Wilhelm /Lopin/
1 SEX M
1 BIRT
2 DATE 1888
2 PLAC Vienna, Austria

Read that top to bottom. Level 0 begins a new record, here an individual. The 1 lines are facts about that person: their name, their sex, the fact that they were born. The 2 lines sit one level deeper, giving the details of the birth: the date and the place. The deeper the number, the more specific the detail. That is the entire grammar of the format.
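For readers who like to see a rule as code, the line grammar can be sketched in a few lines of Python. This is a minimal illustration under simplifying assumptions, not a full GEDCOM parser: the names `GedcomLine` and `parse_line` are made up for this example, and it ignores character encodings and continuation lines.

```python
import re
from typing import NamedTuple, Optional


class GedcomLine(NamedTuple):
    level: int
    xref: Optional[str]   # e.g. "@I1@", only present on lines that open a record
    tag: str
    value: Optional[str]


# level number, optional @ID@, tag, optional value
_LINE = re.compile(r"^\s*(\d+)\s+(?:(@[^@\s]+@)\s+)?([A-Za-z0-9_]+)(?:\s(.*))?$")


def parse_line(text: str) -> GedcomLine:
    """Split one GEDCOM line into its parts (hypothetical helper)."""
    match = _LINE.match(text.rstrip("\r\n"))
    if match is None:
        raise ValueError(f"not a GEDCOM line: {text!r}")
    level, xref, tag, value = match.groups()
    return GedcomLine(int(level), xref, tag, value or None)


SAMPLE = """\
0 @I1@ INDI
1 NAME Wilhelm /Lopin/
1 SEX M
1 BIRT
2 DATE 1888
2 PLAC Vienna, Austria
"""

# Indent each line by its level to make the nesting visible.
for raw in SAMPLE.splitlines():
    line = parse_line(raw)
    print("  " * line.level + line.tag, line.value or "")
```

Printing each tag indented by its level reproduces the bullet-point picture described above.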

Tags are just labels

The capitalised words are tags, short labels for what a line holds. A handful cover most of any file: INDI for an individual, NAME for a name, BIRT for birth, DEAT for death, MARR for marriage, DATE and PLAC for when and where, FAM for a family group. Notice that a surname sits inside slashes in the NAME line, which is how GEDCOM tells the family name apart from the given names.
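The slash convention is easy to handle in code. The sketch below, in Python, pulls the surname out of a NAME value. The function name `split_name` is invented for this example, and it assumes at most one slash-delimited surname per value.

```python
import re
from typing import Optional, Tuple


def split_name(value: str) -> Tuple[str, Optional[str]]:
    """Return (given names, surname) from a NAME value such as 'Wilhelm /Lopin/'."""
    match = re.search(r"/([^/]*)/", value)
    if match is None:
        # No slashes: treat the whole value as given names.
        return value.strip(), None
    surname = match.group(1).strip() or None
    # Everything outside the slashes counts as given names.
    outside = value[:match.start()] + " " + value[match.end():]
    given = " ".join(outside.split())
    return given, surname


print(split_name("Wilhelm /Lopin/"))
```

A value with no slashes simply comes back with no surname, which is how a file would record someone whose family name is unknown.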

People and families are kept separate

GEDCOM does something clever. Instead of writing a person's whole family into their record, it stores individuals and families as separate records and links them with pointers. A pointer is the little code in at-signs, like @I1@ for an individual or @F1@ for a family. Think of them as name tags that let records refer to each other.

0 @F1@ FAM
1 HUSB @I1@
1 WIFE @I2@
1 MARR
2 DATE 1912
1 CHIL @I3@

This family record does not repeat anyone's details. It simply says: the husband is individual @I1@, the wife is @I2@, they married in 1912, and their child is @I3@. Each person is written once and pointed to from wherever they are needed. That is why a GEDCOM can describe thousands of interlinked people without ever duplicating one, and why a good importer can rebuild your entire tree from these connections.
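Following the pointers is mostly a dictionary lookup. The Python sketch below indexes records by their identifier and then resolves a family's members to names. It is an illustration only: the helpers `index_records`, `first_value` and `describe_family` are hypothetical, and the names given to @I2@ and @I3@ are placeholders invented for the example, since the family record above does not show them.

```python
from typing import Dict, List, Optional, Tuple

Record = List[Tuple[int, str, str]]

SAMPLE = """\
0 @I1@ INDI
1 NAME Wilhelm /Lopin/
0 @I2@ INDI
1 NAME Placeholder /Spouse/
0 @I3@ INDI
1 NAME Placeholder /Child/
0 @F1@ FAM
1 HUSB @I1@
1 WIFE @I2@
1 MARR
2 DATE 1912
1 CHIL @I3@
"""


def index_records(text: str) -> Dict[str, Record]:
    """Map each record identifier (e.g. '@I1@') to its (level, tag, value) sub-lines."""
    records: Dict[str, Record] = {}
    current: Optional[Record] = None
    for raw in text.splitlines():
        parts = raw.strip().split(" ", 2)
        if len(parts) < 2:
            continue  # blank or malformed line
        level = int(parts[0])
        if level == 0:
            # "0 @I1@ INDI" opens an addressable record; "0 HEAD" does not.
            current = records.setdefault(parts[1], []) if parts[1].startswith("@") else None
            continue
        if current is not None:
            current.append((level, parts[1], parts[2] if len(parts) > 2 else ""))
    return records


def first_value(record: Record, tag: str) -> Optional[str]:
    """Return the value of the first level-1 line with the given tag."""
    for level, line_tag, value in record:
        if level == 1 and line_tag == tag:
            return value
    return None


def describe_family(records: Dict[str, Record], fam_id: str) -> Dict[str, List[Optional[str]]]:
    """Resolve HUSB, WIFE and CHIL pointers in a family record to NAME values."""
    members: Dict[str, List[Optional[str]]] = {}
    for level, tag, value in records[fam_id]:
        if level == 1 and tag in ("HUSB", "WIFE", "CHIL"):
            person = records.get(value, [])  # a dangling pointer resolves to None
            members.setdefault(tag, []).append(first_value(person, "NAME"))
    return members


print(describe_family(index_records(SAMPLE), "@F1@"))
```

Each person is stored once in the index, and the family record only holds keys into it, which mirrors how the file itself avoids duplication.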

The header and the whole file

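A real file opens with a short header (tagged HEAD) that records which program wrote it and which GEDCOM version it uses, and closes with a TRLR trailer line. In between come the records. Most programs write the individuals first and the families after them, though that order is a convention rather than a rule, and real files often carry other record types too, such as sources and notes. Between those bookends is your family, written in the simplest possible way: a list of people, a list of families, and pointers tying them together.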
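The overall shape of a file can be checked by looking only at its level 0 lines. The Python sketch below is a rough illustration, not a validator: `record_tags` and `has_bookends` are hypothetical helpers, and the sample file, including the program name "ExampleApp", is invented for the example.

```python
from collections import Counter
from typing import List

SAMPLE = """\
0 HEAD
1 SOUR ExampleApp
1 GEDC
2 VERS 5.5.1
0 @I1@ INDI
1 NAME Wilhelm /Lopin/
0 @F1@ FAM
1 HUSB @I1@
0 TRLR
"""


def record_tags(text: str) -> List[str]:
    """Return the tag of every level-0 line, in file order."""
    tags = []
    for raw in text.splitlines():
        parts = raw.split()
        if len(parts) >= 2 and parts[0] == "0":
            # "0 HEAD" has the tag second; "0 @I1@ INDI" has it third.
            has_id = parts[1].startswith("@") and len(parts) > 2
            tags.append(parts[2] if has_id else parts[1])
    return tags


def has_bookends(text: str) -> bool:
    """True if the first record is HEAD and the last is TRLR."""
    tags = record_tags(text)
    return bool(tags) and tags[0] == "HEAD" and tags[-1] == "TRLR"


print(record_tags(SAMPLE))
print(Counter(record_tags(SAMPLE)))
print(has_bookends(SAMPLE))
```

Counting the level 0 tags is also a quick way to see how many people and families a file holds before importing it anywhere.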

Why plain text is a feature, not a limitation

It would be easy to think a format this simple is old-fashioned. The opposite is true. Because a GEDCOM is plain text with an openly published structure, it can be read by any program on any computer, today and in fifty years, without asking permission from whoever made it. There is no secret format, no licence fee, no company that can switch it off. Your family history, written this way, is about as future-proof as a digital thing can be.

You will rarely need to read the raw lines yourself. A program like Dynasty House turns them straight back into a living, pannable tree. But knowing what is underneath is reassuring, because now you can see for yourself that the file holds everything, and that everything in it is plainly, permanently yours.
