Dual Database Architecture:

  • The Dual Database Architecture separates the Knowledge Base of identities from the federated agency-specific data.
  • The Knowledge Base (KB) is composed of two parts: the Knowledge-based Identity Management (KIM) system and a Trusted Identifier Manager (TIM).

Knowledge-based Identity Management (KIM) System:

  • Kept offline (powered down) unless actively processing data.
  • Able to keep track of multiple representations of the same person.
  • Enables higher accuracy in matching.
  • Protects privacy by re-using non-personal references in reports and studies.
  • Without a KB, personally identifiable information (PII) must be pulled and transmitted for each data request.
  • Enables longitudinal research by providing a single path for data flow.
  • Limits the number of copies of identity data being transmitted to one.
  • Completely removes any agency data from the identity store, creating a wall of separation between identities and agency records.
  • Replaces personal information with a hashed cluster ID (KIM ID).
  • Once the data have been processed by KIM, the personal information is no longer needed for research requests or for matching across agency data.
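
The KIM behavior listed above (several representations of one person resolving to a single hashed cluster ID) can be illustrated with a minimal Python sketch. Everything named here is hypothetical: ToyKIM, normalize, and KIM_SECRET are invented for illustration, the matching rule is a deliberately naive stand-in for whatever matching KIM actually performs, and a keyed hash (HMAC-SHA-256) is only one plausible way to produce a "hashed cluster ID".

```python
import hashlib
import hmac

# Hypothetical secret that would live only on the offline KIM server.
KIM_SECRET = b"example-secret-kept-on-the-offline-kim-server"


def normalize(record: dict) -> tuple:
    """Reduce one identity representation to a comparable key (toy rule)."""
    return (
        record["last_name"].strip().lower(),
        record["first_name"].strip().lower()[:1],
        record["dob"],
    )


class ToyKIM:
    """Toy identity store: groups representations into clusters and issues KIM IDs."""

    def __init__(self):
        self._cluster_of = {}       # normalized key -> cluster number
        self._representations = {}  # cluster number -> raw identity records seen

    def kim_id(self, record: dict) -> str:
        key = normalize(record)
        # Reuse the existing cluster, or open a new one for an unseen person.
        cluster = self._cluster_of.setdefault(key, len(self._cluster_of) + 1)
        self._representations.setdefault(cluster, []).append(record)
        # Keyed hash of the cluster number: the ID itself carries no personal data.
        digest = hmac.new(KIM_SECRET, str(cluster).encode(), hashlib.sha256)
        return digest.hexdigest()[:16]


kim = ToyKIM()
a = {"first_name": "Robert", "last_name": "Smith ", "dob": "1990-04-01"}
b = {"first_name": "Rob", "last_name": "smith", "dob": "1990-04-01"}
print(kim.kim_id(a) == kim.kim_id(b))  # True: both representations share one cluster
```

Only the KIM ID would leave the identity store; names and birth dates stay behind, which is the sense in which personal information is no longer needed downstream.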

Trusted Identifier Manager (TIM):

  • After receiving a KIM ID, TIM assigns an agency-specific research ID.
  • Agency records cannot be matched without approval from the agencies.
  • When records from two state agencies require linking, TIM creates a temporary crosswalk and a research-specific substitute ID.
  • The server that contains KIM/TIM is offline unless actively processing data.
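
A minimal sketch of the TIM steps above, under stated assumptions: the class ToyTIM, its method names, the per-agency keys, and the approved_by parameter are all hypothetical, and the HMAC-based derivation is an illustrative choice rather than the documented mechanism. It shows one KIM ID mapping to a different research ID per agency, and a temporary crosswalk that is built only when both agencies approve.

```python
import hashlib
import hmac
import secrets


class ToyTIM:
    """Toy Trusted Identifier Manager (illustrative only)."""

    def __init__(self):
        self._agency_keys = {}  # agency name -> secret key held only by TIM

    def _key(self, agency: str) -> bytes:
        if agency not in self._agency_keys:
            self._agency_keys[agency] = secrets.token_bytes(32)
        return self._agency_keys[agency]

    def research_id(self, kim_id: str, agency: str) -> str:
        """Derive the agency-specific research ID for one KIM ID."""
        digest = hmac.new(self._key(agency), kim_id.encode(), hashlib.sha256)
        return digest.hexdigest()[:16]

    def crosswalk(self, kim_ids, agency_a, agency_b, approved_by):
        """Build a temporary crosswalk with a research-specific substitute ID."""
        if not {agency_a, agency_b} <= set(approved_by):
            raise PermissionError("both agencies must approve the linkage")
        study_key = secrets.token_bytes(32)  # one-off key, discarded on return
        rows = []
        for kim_id in kim_ids:
            substitute = hmac.new(study_key, kim_id.encode(), hashlib.sha256)
            rows.append({
                agency_a: self.research_id(kim_id, agency_a),
                agency_b: self.research_id(kim_id, agency_b),
                "study_id": substitute.hexdigest()[:16],
            })
        return rows


tim = ToyTIM()
edu_id = tim.research_id("kim-0001", "education")
labor_id = tim.research_id("kim-0001", "labor")
print(edu_id != labor_id)  # the same person has unrelated IDs in each agency
```

Because each agency's IDs are derived with a different key that never leaves TIM, holding both agencies' data sets is not enough to join them; the join requires the crosswalk.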

Federated Agency-Specific Anonymized Data:

  • Research data is on a physically separate server from personal information.
  • Each agency retains ownership of its data and has its own set of research IDs; agency data cannot be linked elsewhere without TIM.
  • The anonymized agency data is updated through KIM to allow longitudinal use.
  • In the event of a data breach, there is no connection between separate agencies’ records, and agency-specific IDs cannot be joined without TIM.
  • Full extracts of multi-year state agency data are no longer needed because recent updates are the only extracts required from state agencies.
  • Long-term studies can be supported and privacy protected using the consistent research IDs, rather than re-matching personal information.
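
The incremental-update point above can be sketched as follows. AgencyStore and its methods are hypothetical names, and the research IDs and records are made-up placeholders. The sketch assumes each recent extract arrives already keyed by the agency's consistent research IDs, so new rows attach to existing histories without re-matching personal information.

```python
from collections import defaultdict


class AgencyStore:
    """Toy anonymized store for one agency, keyed only by its research IDs."""

    def __init__(self, agency: str):
        self.agency = agency
        self._history = defaultdict(list)  # research ID -> [(year, record), ...]

    def apply_update(self, rows):
        """Append a recent extract; each row is (research_id, year, record)."""
        for research_id, year, record in rows:
            self._history[research_id].append((year, record))

    def longitudinal(self, research_id: str):
        """Return one person's records in year order, with no PII involved."""
        return sorted(self._history.get(research_id, []), key=lambda item: item[0])


education = AgencyStore("education")
# Placeholder research ID and records; only the newest year is sent each time.
education.apply_update([("a3f9c2", 2022, {"enrolled": True})])
education.apply_update([("a3f9c2", 2021, {"enrolled": False})])
print([year for year, _ in education.longitudinal("a3f9c2")])  # [2021, 2022]
```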