31 July 2013

Data Glut -- Are we creating a headache for our descendants and future historians?

Creative Commons, Attribution-NoDerivs 2.0 Generic

As I feel increasingly overwhelmed by all the information out there – more and more FB posts, tweets, blogs, Google+ discussions, LinkedIn dialogues, images, videos and much more – I sometimes think that we are not doing future generations any favors.

Apparently, someone else feels the same way, as discussed in the article “An Open Letter to the Historians of the 22nd Century: Sorry for all the stuff.”

As family historians, we crave anything written by our ancestors – no matter how minimal. As the author says, “...In contrast, future historians of my era (for whom this post is written) will have information, both useful and useless, sprayed at them with a fire hose. It’s worth thinking about how you future historians can sift through the flood of primary-source material ...”

And though I am much better at organizing my images in folders labeled with dates and sometimes places, I do not re-label the individual image files, so I greatly identified with this statement: “Sorry about all of my files undescriptively labeled DSCimage987234534.jpg and GrantProposal2.docx. Sorry for the mess.”

With the plethora of “stuff” out there – what can we do to not overwhelm our descendants and future historians?

copyright © National Genealogical Society, 3108 Columbia Pike, Suite 300, Arlington, Virginia 22204-4370. http://www.ngsgenealogy.org.
NGS does not imply endorsement of any outside advertiser or other vendors appearing in this blog.
Republication of UpFront articles is permitted and encouraged for non-commercial purposes without express permission from NGS. Please drop us a note telling us where and when you are using the article. Express written permission is required if you wish to republish UpFront articles for commercial purposes. You may send a request for express written permission to [email protected]. All republished articles may not be edited or reworded and must contain the copyright statement found at the bottom of each UpFront article.


  1. Another problem will be that researchers in the future will not be able to read any cursive writing, let alone the old documents with the old-style writing.

  2. So true, Claudia ... my kids "type" everything ;-) ... UpFront with NGS talked about just this topic in this post: http://upfront.ngsgenealogy.org/2012/02/but-who-will-read-record-does-not.html

  3. UpFront with NGS reader Peter Bradish responds ...

    You asked "what can we do to not overwhelm our descendants and future historians?" Having a straightforward and easy-to-use method of recording our sources would help. Many of those I know who want to record their sources still don't do it. Their complaint, if they say anything at all, is that they know they should do it but it takes so long and is difficult.

    Elizabeth Shown Mills and others have put extensive effort into defining sources and have provided good examples. But it's still not easy for researchers to follow those definitions and examples when recording what they have found. In fact, it takes a noticeable amount of time just to decide which source type to use, since there are so many for some kinds of sources. For example, when looking at an image or paper transcription of some early governmental record, is this a manuscript, a book, or something else? Just trying to choose a source type for census records can take a fair amount of time, and trying to be consistent in how you enter each item of data is even harder unless you are entering a lot of census records all at one time. And then there is all the time you spend keying the information into the computer. The joy of genealogy is in the research and finding ancestors, not in doing the paperwork for what you have found.

    Sources, pictures, letters and other valuable genealogical information will likely be lost if we don't find a simple way to record them. In fact, if it is simple, we are likely to gather and record many more records than if it is difficult and time-consuming. So the high volume of genealogical data could significantly increase.

    FamilySearch Family Tree is the first serious effort I've seen to create a "holy grail" for genealogists, "One person, one record". It's taken 5 years or so to reach the quality of what they have now. Doing it this way will help relieve the "headache for our descendants and future historians" and not overwhelm them with lots of different records, all for the same ancestor.

    If they can do the same for sources as they are doing for people, "One source, one record", and make it easy for users to identify and save the record, they will make a seriously large dent in the growth of redundant data. What if we had a repository of source records, for at least the more common types, such that a researcher could cite each needed source by filling in only the variable data, such as page number and other specific information? What if all the genealogy services we use for finding records of our ancestors could provide a standard source description that doesn't have to be identified by source type and doesn't have to be manually keyed in by the researcher?

    Just my two cents worth ...

  4. You have touched on something that really resonates with me!

    I am not a professional but I try to be a "good" genealogist and give thought to what makes sense. I have been doing genealogy research for 40 years, on and off--I am in my early 60s. I have made and learned from a lot of the mistakes amateurs are prone to. With an ancestor named John Dean, one learns quickly not to make assumptions. I have found another ancestor under at least a half dozen spellings of the name, a couple of which are common denominators for multiple non-English names. And I know that county histories and even obituaries are only clues, not data.
    I am NOT technologically savvy, though I am learning since I retired 2 months ago. My point is that--since I am not a professional and only interested in my own trees--I often do not have much incentive to go to the original record, especially if the information I find in the secondary source (including quaking leaves) is consistent with what I already know. I have found errors in records from the old days so I view these with more skepticism, but I tend to assume that contemporary informants have personal information regarding the 2 generations before theirs, and therefore I assume (perhaps erroneously at times) that it is accurate. Nevertheless I would like to be accurate myself for others who may use my information, so when I document my sources I also need an easy way to say this is from an extracted list, or from an Ancestry tree, not the actual record.

    I will check out the FamilySearch Family Tree that was suggested. Thank you.