Capitalize Names

Description

Capitalize Names changes the capitalization of name fields. The function will capitalize any or all of the "display" name fields: Title, Prefix, Given, PreSurname, Surname, Suffix, and OtherName.

The term "capitalize" in this context means "change the name field to the usual mix of upper and lowercase letters." That is obviously not the same as "change the name to all capitals," which is what some people thought I meant! See the capitalization notes.

Step by Step

  1. Choose Capitalize Names from the function tree.
  2. Check the box for each name field that you want to modify.
  3. Check the Capitalize All Uppercase Only box if you want to change only fields that are entirely uppercase and leave mixed-case fields as they are. This box is checked by default.
  4. Set the Flag Filter, if desired. TMG Utility will only change names for people who pass the filter.
  5. Click the [Capitalize Names] button. The program will inspect the name data and make changes as necessary.

Important Capitalization Notes

The program implements LCFL capitalization; LCFL is an abbreviation for "LowerCase Following Letter." Any letter that follows another letter is lowercased. Letters at the beginning of a field, or that follow a space or other non-alphabetic character, are uppercased.

Here are some examples of how the LCFL algorithm will change the case of names.

Input          Output
SMITH          Smith
?SMITH         ?Smith
SMITH-JONES    Smith-Jones
ST. IVES       St. Ives
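
For readers who want to see the rule spelled out, here is a minimal Python sketch of LCFL as described above. It is not the program's own code (TMG Utility is written in VB), and the function name lcfl is only an illustration.

```python
def lcfl(text: str) -> str:
    """LowerCase Following Letter: lowercase any letter that follows
    another letter, uppercase every other letter."""
    out = []
    prev_is_letter = False
    for ch in text:
        if ch.isalpha():
            # A letter after a letter is lowercased; a letter at the start
            # of the field or after a non-letter is uppercased.
            out.append(ch.lower() if prev_is_letter else ch.upper())
            prev_is_letter = True
        else:
            # Spaces, punctuation and digits pass through unchanged.
            out.append(ch)
            prev_is_letter = False
    return "".join(out)


if __name__ == "__main__":
    for name in ["SMITH", "?SMITH", "SMITH-JONES", "ST. IVES"]:
        print(name, "->", lcfl(name))
```

Run against the four inputs in the table, the sketch produces the four outputs shown there.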

Four name fields support optional exception tables. An exception table is a list of values that should be capitalized exactly as shown in the file, rather than according to the rules implemented in the capitalization routine. Use the following file names: "surnames.txt", "givnames.txt", "titles.txt" and "suffixes.txt". To be processed, these files must exist in the same directory as the program's EXE file. At present, only two files are provided with the program: a surname table and a suffix table. The tables are read only when the program starts, so if you change an exception table, you must close and reopen TMG Utility or the program will not notice the change.
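
The following Python sketch shows one way such tables could be loaded and consulted. Several details are assumptions, not facts about TMG Utility: that each file holds one value per line, that lookups ignore case, that the files are UTF-8, and that each file maps to the field its name suggests. The names TABLE_FILES, parse_exceptions and load_tables are hypothetical.

```python
from pathlib import Path

# Assumed mapping from name field to table file (inferred from the file names).
TABLE_FILES = {
    "Surname": "surnames.txt",
    "Given": "givnames.txt",
    "Title": "titles.txt",
    "Suffix": "suffixes.txt",
}


def parse_exceptions(lines):
    """Build a lookup from case-folded value to the spelling in the file.
    Assumes one entry per line; blank lines are skipped."""
    table = {}
    for line in lines:
        entry = line.strip()
        if entry:
            table[entry.casefold()] = entry
    return table


def load_tables(directory, encoding="utf-8"):
    """Read whichever table files exist in `directory`; missing files are
    simply skipped, since the tables are optional."""
    tables = {}
    for field, filename in TABLE_FILES.items():
        path = Path(directory) / filename
        if path.is_file():
            lines = path.read_text(encoding=encoding).splitlines()
            tables[field] = parse_exceptions(lines)
    return tables
```

A lookup is then tables.get("Surname", {}).get(value.casefold()), which returns the table spelling or None. Because load_tables runs once, a sketch like this would also need a restart to pick up edits to the files.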

Surname fields receive special processing, as follows:

  1. Like every field that has an exception table, the surname is first checked against its table (surnames.txt). If the surname matches an entry in the table exactly, the table version is substituted for the database version, and we're done. Otherwise, we continue to #2.
  2. The database version is passed to the LCFL routine, which changes the name according to the LCFL rules described above.
  3. The result of the LCFL processing is checked for some common exceptions as described below. These exceptions are handled programmatically because putting all of them in the exceptions table would make the exceptions table quite large.
    • All surnames of the form "Mcxx" are changed to "McXx". The name must be 4 or more characters long.
    • All surnames of the form "Macxx" are changed to "MacXx". The name must be 5 or more characters long.
    • All embedded " Dit " are changed to " dit ".
    • All embedded " Van " are changed to " van ".
    • All embedded " De " are changed to " de ".

    As you might suspect, the rules above don't work for all names. This is why the exception table exists.

  4. The new version is substituted for the existing version.
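
The four steps above can be sketched in Python as follows. This is an illustration under stated assumptions, not the program's VB code: the exception table is modeled as a dict keyed by the case-folded surname, the "Mc" and "Mac" rules are applied only at the start of the field, and the function names are hypothetical.

```python
def lcfl(text):
    """LowerCase Following Letter, as described in the notes above."""
    out, prev_is_letter = [], False
    for ch in text:
        if ch.isalpha():
            out.append(ch.lower() if prev_is_letter else ch.upper())
            prev_is_letter = True
        else:
            out.append(ch)
            prev_is_letter = False
    return "".join(out)


def raise_after_prefix(name, prefix):
    """'Mcxx' -> 'McXx': uppercase the letter right after the prefix.
    Requires at least two characters after the prefix, which gives the
    4-character minimum for 'Mc' and the 5-character minimum for 'Mac'."""
    if len(name) >= len(prefix) + 2 and name.startswith(prefix):
        i = len(prefix)
        return name[:i] + name[i].upper() + name[i + 1:]
    return name


def capitalize_surname(raw, exceptions):
    # Step 1: an exception-table hit wins outright (assumed case-insensitive).
    hit = exceptions.get(raw.casefold())
    if hit is not None:
        return hit
    # Step 2: apply the general LCFL rule.
    name = lcfl(raw)
    # Step 3: programmatic exceptions.
    name = raise_after_prefix(name, "Mc")
    name = raise_after_prefix(name, "Mac")
    for word in (" Dit ", " Van ", " De "):
        name = name.replace(word, word.lower())
    # Step 4: the caller stores the returned value in place of the old one.
    return name
```

With an empty table, capitalize_surname("MACEY", {}) returns "MacEy", which is exactly the kind of result the exception table is meant to catch. Passing {"macey": "Macey"} returns "Macey" instead.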

The program includes a file of approximately 600 surname exceptions. I made this by running a program over a list of 32,000 surnames. Whenever the method above capitalized a surname differently from the way it appeared in the original list, I wrote that name to the exceptions table.

The program also includes a small list of suffix exceptions.

The program uses a pretty gross method to implement the exceptions tables. This is partly due to my laziness and partly due to the limitations of VB. VB is great for quick user interface programming, but bad for implementing anything other than the simplest data structures. It's no speed demon either, but in this case the resulting program is still many, many times faster than the manual alternative. (My best excuse so far--I'll try to think of more!)

Some TMG'ers use name fields differently than they were intended. Depending on how those fields are used, the program may or may not do the right thing.

I made sure that the program handled international characters that are defined as part of the extended ASCII processed by VB. I don't know enough about international character sets to know how well it will handle all the variations that exist. If you have names with international characters, please review them carefully after you use this routine.

If the Given name is modified, the program will compare the original input value against the SortGiven name part. If the original value matches the SortGiven field, the SortGiven field will be modified to keep the two values in sync. The same processing is applied to the Surname and SortSurname fields.
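
A short Python sketch of that sync rule follows. The dict-based record and the function name are hypothetical stand-ins for the database row. Only the comparison logic comes from the description above.

```python
def capitalize_with_sort(record, field, sort_field, capitalize):
    """Return a copy of `record` with `field` recapitalized. The sort field
    is updated too, but only if it held the same value as the original."""
    original = record[field]
    updated = capitalize(original)
    result = dict(record)
    if updated == original:
        return result  # nothing was modified, so nothing to sync
    if record.get(sort_field) == original:
        # The sort value mirrored the display value, so keep them in sync.
        result[sort_field] = updated
    result[field] = updated
    return result


if __name__ == "__main__":
    row = {"Given": "JOHN", "SortGiven": "JOHN"}
    print(capitalize_with_sort(row, "Given", "SortGiven", str.title))
```

A sort value that was entered separately, for example a SortGiven of "Jon" beside a Given of "JOHN", is left alone.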