What is Soundex?
Soundex is an indexing system that
translates a name into a 4-character code consisting of 1 letter and 3 digits. The most
familiar application of Soundex is its use by the US Bureau of the Census to create an
index for individuals listed in the US census records from 1880 onward.
The advantage of Soundex is its ability to group
names by sound rather than by exact spelling. Take, for example, the name Naesmyth. A
census recorder or a particular family branch might spell the name variously as
Naesmyth, Nasmyth, Nasmith, Nesmith, Nessmith, Neasmith, etc. The Soundex code for all of
these is N253.
The existing indexes for the 1880, 1900, 1910,
and 1920 US federal census enumerations all use the Soundex code.
Soundex Rules
- All Soundex codes have exactly 4 alphanumeric characters [no more, no
less]: 1 letter followed by 3 digits.
- The first letter of the name is the first character of the Soundex code.
- The 3 digits are defined sequentially from the name using the Soundex
Key below.
- Adjacent letters in the name which belong to the
same Soundex Key code number are assigned a single digit. [See examples 3 and 4 below]
- If the end of the name is reached prior to filling 3 digits, use
zeroes to complete the code.
- All codes have only 4 characters, even if the name is long enough to
yield more.
The Soundex Key
| Code | Letters |
|---|---|
| 1 | B P F V |
| 2 | C S K G J Q X Z |
| 3 | D T |
| 4 | L |
| 5 | M N |
| 6 | R |
| no code | A E H I O U Y W |
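The rules and key above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the names `SOUNDEX_KEY`, `LETTER_TO_DIGIT`, and `soundex` are made up for this example, and "adjacent" is taken literally to mean letters directly next to each other in the name. Other published Soundex variants treat letters separated by H or W differently, which this sketch does not attempt.

```python
# Soundex Key from the table above: digit -> letters that share it.
SOUNDEX_KEY = {
    "1": "BPFV",
    "2": "CSKGJQXZ",
    "3": "DT",
    "4": "L",
    "5": "MN",
    "6": "R",
}

# Inverted key: letter -> digit. A E H I O U Y W are absent ("no code").
LETTER_TO_DIGIT = {
    letter: digit
    for digit, letters in SOUNDEX_KEY.items()
    for letter in letters
}


def soundex(name: str) -> str:
    """Return the 4-character code for a name, following the rules listed above."""
    # Assumption: anything that is not A-Z (spaces, hyphens, apostrophes) is dropped.
    letters = [ch for ch in name.upper() if "A" <= ch <= "Z"]
    if not letters:
        raise ValueError("name must contain at least one letter A-Z")

    code = letters[0]                            # the first letter is kept as-is
    previous = LETTER_TO_DIGIT.get(letters[0])   # its key number still counts for adjacency

    for ch in letters[1:]:
        digit = LETTER_TO_DIGIT.get(ch)          # None for letters with no code
        if digit is not None and digit != previous:
            code += digit
        previous = digit
        if len(code) == 4:                       # never more than 4 characters
            break

    return code.ljust(4, "0")                    # pad with zeroes if the name ran out


# The spelling variations of Naesmyth all land on the same code.
for variant in ["Naesmyth", "Nasmyth", "Nasmith", "Nesmith", "Nessmith", "Neasmith"]:
    print(variant, soundex(variant))
```

Storing the key as digit-to-letters keeps it readable next to the table, while the inverted dictionary makes each letter lookup a single step.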
Examples
Example 1 - NAESMYTH = N253
- The first letter of the name is the first part of the Soundex code, N
- Vowels are ignored, so ignore A and E
- The next part of the code is from the letter S which
is assigned 2
- The next part of the code is from the letter M which
is assigned 5
- Y has no code, so ignore Y
- The next part of the code is from the letter T which is assigned 3
- The code now has 3 digits, so the remaining H is not needed
- The resulting Soundex code for NAESMYTH is N253
- Perform this exercise with your own spelling variation of Naesmyth.
You will find it is always N253.
Example 2 - BAIRD = B630
- The first letter of the name is the first part of the Soundex code, B
- Vowels are ignored, so ignore A and I
- The next part of the code is from the letter R which
is assigned 6
- The next part of the code is from the letter D which
is assigned 3
- 3 digits are required, but we are out of letters, so use 0 for the last digit
- The resulting Soundex code for BAIRD is B630
Example 3 - CALLAHAN = C450
- The first letter of the name is the first part of the Soundex code, C
- Vowels are ignored, so ignore A
- The next part of the code is from the letter L which
is assigned 4
- Adjacent letters with the same key number are coded as one, so the second L is
ignored.
- Vowels and H's are ignored, so ignore A, H
and A
- The next part of the code is from the letter N which
is assigned 5
- 3 digits are required, but we are out of letters, so use 0 for the last digit
- The resulting Soundex code for CALLAHAN is C450
Example 4 - SCHULTZ = S432
- The first letter of the name is the first part of the Soundex code, S
- C has the same key number (2) as the S
preceding it, so ignore C.
- Vowels and H's are ignored, so ignore H and U
- The next part of the code is from the letter L which
is assigned 4
- The next part of the code is from the letter T which
is assigned 3
- The next part of the code is from the letter Z which
is assigned 2
- The resulting Soundex code for SCHULTZ is S432
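The walkthroughs above can also be reproduced mechanically. The sketch below is a hypothetical helper (`soundex_steps` and the `DIGIT` table are names invented here, not part of any library) that returns the code along with a one-line note per letter, in the same spirit as the bulleted examples. It assumes the name contains only the letters A-Z and applies the adjacency rule literally, as the rules in this document state it.

```python
# Letter groups from the Soundex Key, mapped to their digit.
KEY = {"BPFV": "1", "CSKGJQXZ": "2", "DT": "3", "L": "4", "MN": "5", "R": "6"}
DIGIT = {ch: d for group, d in KEY.items() for ch in group}


def soundex_steps(name):
    """Return (code, steps); steps holds one note per decision made."""
    letters = name.upper()          # assumption: letters A-Z only
    code = letters[0]
    steps = [f"{letters[0]}: first letter, kept as-is"]
    previous = DIGIT.get(letters[0])

    for ch in letters[1:]:
        digit = DIGIT.get(ch)       # None when the letter has no code
        if len(code) == 4:
            steps.append(f"{ch}: code already has 4 characters, ignored")
        elif digit is None:
            steps.append(f"{ch}: no code, ignored")
        elif digit == previous:
            steps.append(f"{ch}: same key number as the letter before it, ignored")
        else:
            code += digit
            steps.append(f"{ch}: assigned {digit}")
        previous = digit

    if len(code) < 4:
        steps.append(f"out of letters, padded with {4 - len(code)} zero(s)")
        code = code.ljust(4, "0")
    return code, steps


code, steps = soundex_steps("SCHULTZ")
print(code)                         # S432
for step in steps:
    print(" -", step)
```

Running it on BAIRD or CALLAHAN shows the zero-padding step at the end, matching Examples 2 and 3.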